IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
Filing under 35 U.S.C. § 371 

Corresponding to International Application Serial No.: 
PCT/FR97/00214 

Applicants: CAPUT, Daniel, FERRARA, Bernard and 
KAGHAD, Mourad 

International Filing Date: February 3, 1997 



For: PURIFIED SR-p70 PROTEIN 

Assistant Commissioner for Patents 
Box PCT 
Attn: EO/US 
Washington, D.C. 20231 

Dear Sir: 

PRELIMINARY AMENDMENT 

Please amend the above-identified application as follows: 
In the Claims 

Please amend Claims 1-36 and add Claims 37 and 38 as follows before calculating the filing 
fee for the above-identified application: 

1. (Amended) A [Purified] purified polypeptide, comprising an amino acid sequence 
selected from the group consisting of: 

a) [the] sequence SEQ ID No. 2; 

b) [the] sequence SEQ ID No. 4; 

c) [the] sequence SEQ ID No. 6; 

d) [the] sequence SEQ ID No. 8; 

e) [the] sequence SEQ ID No. 10; 

f) [the] sequence SEQ ID No. 13; 

g) [the] sequence SEQ ID No. 15; 

h) [the] sequence SEQ ID No. 17; 

i) [the] sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15, 
SEQ ID No. 17 or SEQ ID No. 19. 

2. (Amended) A [Polypeptide] polypeptide according to Claim 1, [characterized in that 
it] [comprises] comprising [the] an amino acid sequence selected from the group consisting of 
SEQ ID No. 6, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17 and SEQ ID No. 19. 

3. (Amended) A [Polypeptide] polypeptide according to Claim 1, [characterized in that 
it comprises] comprising [the] a sequence lying between: 

- residue 110 and residue 310 of SEQ ID No. 2 or 6;

- residue 60 and residue 260 of SEQ ID No. 8. 

4. (Amended) A [Polypeptide] polypeptide according to Claim 1, [characterized in that 
it] which [results] is produced from an alternative splicing of [the] messenger RNA of [the] a 
corresponding gene. 

5. (Amended) A [Polypeptide] polypeptide according to [any one of the preceding 
claims,] Claim 1 [characterized in that it] that is a recombinant polypeptide produced in the 
form of a fusion protein. 

6. (Amended) An [Isolated] isolated nucleic acid sequence coding for a polypeptide 
according to [any one of the preceding claims] Claim 1.

7. (Amended) An [Isolated] isolated nucleic acid sequence according to Claim 6, 
[characterized in that it is] said nucleic acid having a sequence selected from the group 
consisting of: 

a) [the] sequence SEQ ID No. 1; 

b) [the] sequence SEQ ID No. 3; 

c) [the] sequence SEQ ID No. 5; 

d) [the] sequence SEQ ID No. 7; 

e) [the] sequence SEQ ID No. 9; 






Docket No. IVD 913 



f) [the] sequence SEQ ID No. 11;

g) [the] sequence SEQ ID No. 12; 

h) [the] sequence SEQ ID No. 14; 

i) [the] sequence SEQ ID No. 16; 
j) [the] sequence SEQ ID No. 18; 

k) [the] nucleic acid sequences capable of hybridizing specifically with [the] sequence
SEQ ID No. 1, SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9, SEQ ID
No. 11, SEQ ID No. 12, SEQ ID No. 14, SEQ ID No. 16 or SEQ ID No. 18 or with
[the] sequences complementary to them, or of hybridizing specifically with their
proximal sequences; and
l) [the] sequences derived from the sequences a), b), c), d), e), f), g), h), i), j) or k) as a
result of the degeneracy of the genetic code, mutation, deletion, insertion, and
alternative splicing or an allelic variability.

8. (Amended) A [Nucleotide] nucleotide sequence according to Claim 6, [characterized 
in that it is a sequence] selected from the group consisting of SEQ ID No. 5, SEQ ID No. 12,
SEQ ID No. 14, SEQ ID No. 16 and SEQ ID No. 18 and coding, respectively, for the
polypeptide of sequences SEQ ID No. 6, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17 and
SEQ ID No. 19.

9. (Amended) A [Cloning] cloning and/or expression vector containing a nucleic acid 
sequence according to [any one of Claims] Claim 6 [to 8]. 

10. (Amended) A [Vector] vector, according to Claim 9, [characterized in that it] which 
is [the] plasmid pSE1.

11. (Amended) A [Host] host cell transfected by a vector according to Claim 9 [or 10].

12. (Amended) A [Transfected] transfected host cell according to Claim 11,
[characterized in that it] which is E. coli MC 1061.

13. (Amended) A [Nucleotide] nucleotide probe or nucleotide primer [, characterized in
that it] which hybridizes specifically with [any one of the sequences according to Claims] the 
nucleic acid of Claim 6 [to 8] or [the] a nucleic acid having sequences complementary to them 
or [the corresponding] messenger RNAs corresponding to them or [the corresponding] genes 
corresponding to them.

14. (Amended) A [Probe] probe or primer according to Claim 13[, characterized in] that 
[it] contains at least 16 nucleotides. 

15. (Amended) A [Probe] probe or primer according to Claim 13 [characterized in that 
it] that comprises the whole of the sequence of the gene coding for [one of the polypeptides of 
Claim 1] a polypeptide, wherein said polypeptide comprises an amino acid sequence selected 
from the group consisting of: 



a) sequence SEQ ID No. 2;

b) sequence SEQ ID No. 4;

c) sequence SEQ ID No. 6;

d) sequence SEQ ID No. 8;

e) sequence SEQ ID No. 10;

f) sequence SEQ ID No. 13;

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19.



16. (Amended) A [Nucleotide] nucleotide probe or primer selected from the group
consisting of the following oligonucleotides or sequences complementary to them: 



SEQ ID No. 20: GCG AGC TGC CCT CGG AG 
SEQ ID No. 21: GGT TCT GCA GGT GAC TCA G 
SEQ ID No. 22: GCC ATG CCT GTC TAC AAG 
SEQ ID No. 23: ACC AGC TGG TTG ACG GAG
SEQ ID No. 24: GTC AAC CAG CTG GTG GGC CAG
SEQ ID No. 25: GTG GAT CTC GGC CTC C 
SEQ ID No. 26: AGG CCG GCG TGG GGA AG 
SEQ ID No. 27: CTT GGC GAT CTG GCA GTA G 
SEQ ID No. 28: GCG GCC ACG ACC GTG AC 
SEQ ID No. 29: GGC AGC TTG GGT CTC TGG 
SEQ ID No. 30: CTG TAC GTC GGT GAC CCC 
SEQ ID No. 31: TCA GTG GAT CTC GGC CTC 
SEQ ID No. 32: AGG GGA CGC AGC GAA ACC 
SEQ ID No. 33: CCA TCA GCT CCA GGC TCT C 
SEQ ID No. 34: CCA GGA CAG GCG CAG ATG 
SEQ ID No. 35: GAT GAG GTG GCT GGC TGG A 
SEQ ID No. 36: TGG TCA GGT TCT GCA GGT G 
SEQ ID No. 37: CAC CTA CTC CAG GGA TGC 
SEQ ID No. 38: AGG AAA ATA GAA GCG TCA GTC 
SEQ ID No. 39: CAG GCC CAC TTG CCT GCC 
and SEQ ID No. 40: CTG TCC CCA AGC TGA TGA G 



17. (Amended) The [Use] use of a sequence according to [any one of Claims] Claim 6 
[to 8,] for the manufacture of oligonucleotide primers for sequencing reactions or specific 
amplification reactions according to the PCR technique or any variant of the latter. 

18. (Amended) A [Nucleotide] nucleotide primer pair[, characterized in that it 
comprises] comprising [the] primers selected from the group consisting of the following 
sequences: 

a) sense primer: GCG AGC TGC CCT CGG AG (SEQ ID No. 20) 
antisense primer: GGT TCT GCA GGT GAC TCA G (SEQ ID No. 21) 

b) sense primer: GCC ATG CCT GTC TAC AAG (SEQ ID No. 22) 
antisense primer: ACC AGC TGG TTG ACG GAG (SEQ ID No. 23) 

c) sense primer: GTC AAC CAG CTG GTG GGC CAG (SEQ ID No. 24) 
antisense primer: GTG GAT CTC GGC CTC C (SEQ ID No. 25) 

d) sense primer: AGG CCG GCG TGG GGA AG (SEQ ID No. 26)
antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27)

e) sense primer: GCG GCC ACG ACC GTG AC (SEQ ID No. 28)
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29) 

f) sense primer: CTG TAC GTC GGT GAC CCC (SEQ ID No. 30) 
antisense primer: TCA GTG GAT CTC GGC CTC (SEQ ID No. 31) 

g) sense primer: AGG GGA CGC AGC GAA ACC (SEQ ID No. 32) 
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29) 

h) sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T) 
antisense primer: CCA TCA GCT CCA GGC TCT C (SEQ ID No. 33) 

i) sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T) 
antisense primer: CCA GGA CAG GCG CAG ATG (SEQ ID No. 34) 

j) sense primer: CCCCCCCCCCCCCCCN (where N equals G, A or T) 

antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27) 

k) sense primer: CAC CTA CTC CAG GGA TGC (SEQ ID No. 37) 

antisense primer: AGG AAA ATA GAA GCG TCA GTC (SEQ ID No. 38) and 

l) sense primer: CAG GCC CAC TTG CCT GCC (SEQ ID No. 39)

antisense primer: CTG TCC CCA AGC TGA TGA G (SEQ ID No. 40) 

19. (Amended) The [Use] use of a sequence according to [any one of Claims] Claim 6 
[to 8,] [which is usable] in gene therapy. 

20. (Amended) The [Use] use of a sequence according to [any one of Claims] Claim 6 
[to 8,] for the production of diagnostic nucleotide probes or primers, or of antisense sequences 
which are usable in gene therapy. 

21. (Amended) The [Use] use of nucleotide primers according to [any one of Claims] 
Claim 6 [to 8,] for sequencing. 

22. (Amended) The [Use] use of a probe or primer according to [any one of Claims] 
Claim 13 [to 16,] as an in vitro diagnostic tool for the detection, by hybridization experiments, 
of nucleic acid sequences coding for a polypeptide, wherein said polypeptide comprises an
amino acid sequence selected from the group consisting of:

a) sequence SEQ ID No. 2;

b) sequence SEQ ID No. 4;

c) sequence SEQ ID No. 6;

d) sequence SEQ ID No. 8;

e) sequence SEQ ID No. 10;

f) sequence SEQ ID No. 13;

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19 [according to any one of Claims 1 to 4,] in
biological samples, or for the demonstration of aberrant syntheses or of genetic
abnormalities.



23. (Amended) A [Method] method of in vitro diagnosis for the detection of aberrant 
syntheses or of genetic abnormalities in the nucleic acid sequences coding for a polypeptide, 
wherein said polypeptide comprises an amino acid sequence selected from the group consisting 
of: 

a) sequence SEQ ID No. 2;

b) sequence SEQ ID No. 4;

c) sequence SEQ ID No. 6;

d) sequence SEQ ID No. 8;

e) sequence SEQ ID No. 10;

f) sequence SEQ ID No. 13;

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No.
15, SEQ ID No. 17 or SEQ ID No. 19

[according to any one of Claims 1 to 4, characterized in that it comprises] comprising the steps
of:

- [the] bringing of a nucleotide probe according to [any one of Claims] Claim 13 [to 
16] into contact with a biological sample under conditions permitting the formation 
of a hybridization complex between the [said] probe and the [abovementioned] 
nucleotide sequence, where appropriate after a prior step of amplification of the 
[abovementioned] nucleotide sequence;

the detection of the hybridization complex [possibly] formed; and

where appropriate, [the] sequencing of the hybridization complex' nucleotide
sequence [forming the hybridization complex] with the probe of the invention.

24. (Amended) The [Use] use of a nucleic acid sequence according to [any one of 
Claims] Claim 6 [to 8,] for the production of a recombinant polypeptide wherein said 
polypeptide comprises an amino acid sequence selected from the group consisting of: 

a) sequence SEQ ID No. 2;

b) sequence SEQ ID No. 4;

c) sequence SEQ ID No. 6;

d) sequence SEQ ID No. 8;

e) sequence SEQ ID No. 10;

f) sequence SEQ ID No. 13;

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19 [according to any one of Claims 1 to 5].

25. (Amended) A [Method] method of production of a recombinant SR-p70 protein, 
characterized in that transfected cells according to Claim [10 or] 11 are cultured under 
conditions permitting the expression of a recombinant polypeptide of sequence SEQ ID No. 2, 
SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19 or any biologically active fragment or derivative, and in that
the [said] recombinant polypeptide is recovered. 

26. (Amended) Mono- or polyclonal antibodies or their fragments, chimeric antibodies 
or immunoconjugates, characterized in that they are capable of specifically recognizing a 
polypeptide according to [any one of Claims] Claim 1 [to 4]. 

27. (Amended) Use of the antibodies according to [the preceding claim,] Claim 26 for 
the purification or detection of a polypeptide, wherein said polypeptide comprises an amino
acid sequence selected from the group consisting of:

a) sequence SEQ ID No. 2;

b) sequence SEQ ID No. 4;

c) sequence SEQ ID No. 6;

d) sequence SEQ ID No. 8;

e) sequence SEQ ID No. 10;

f) sequence SEQ ID No. 13;

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4,
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19 [according to any one of Claims 1 to 4] in a
biological sample.

28. (Amended) A [Method] method of in vitro diagnosis of pathologies correlated with 
an expression or an abnormal accumulation of SR-p70 proteins, in particular the phenomena of 
carcinogenesis, from a biological sample, [characterized in that] comprising the steps of 
contacting at least one antibody according to Claim 25 [is brought into contact] with the said 
biological sample under conditions permitting the [possible] formation of specific 
immunological complexes between an SR-p70 protein and the said antibody or antibodies, and 
detecting the presence of [in that the] specific immunological complexes [possibly] formed [are 
detected].

29. (Amended) A [Kit] kit for the in vitro diagnosis of an expression or an abnormal
accumulation of SR-p70 proteins in a biological sample and/or for measuring the level of 
expression of these proteins in the said sample, comprising: 

at least one antibody according to Claim 25, optionally bound to a support, 
means of visualization of the formation of specific antigen-antibody complexes 
between an SR-p70 protein and the said antibody, and/or means of quantification of 
these complexes. 

30. (Amended) A [Method] method for the early diagnosis of tumour formation, 
[characterized in that] wherein autoantibodies directed against an SR-p70 protein are 
demonstrated in a serum sample drawn from an individual, according to the steps that [consist 
in] comprise bringing a serum sample drawn from an individual into contact with a polypeptide 
of the invention, optionally bound to a support, under conditions permitting the formation of 
specific immunological complexes between the said polypeptide and [the] autoantibodies 
[possibly] present in the serum sample, and in that the specific immunological complexes 
[possibly] formed are detected. 

31. (Amended) A [Method] method of determination of an allelic variability, a 
mutation, a deletion, an insertion, a loss of heterozygosity or a genetic abnormality of the SR- 
p70 gene, characterized in that it utilizes at least one nucleotide sequence according to [any one 
of Claims] Claim 6 [to 8]. 

32. (Amended) A [Method] method of determination of an allelic variability of the SR- 
p70 gene at position -30 and -20 relative to the initiation ATG of exon 2 which may be involved 
in pathologies[, and characterized in that it comprises at least] comprising : 

a step during which exon 2 of the SR-p70 gene carrying the target sequence is 
amplified by PCR using a pair of oligonucleotide primers according to [any one of
Claims] Claim 6 [to 8]; 

a step during which the amplified products are treated with a restriction enzyme 
whose cleavage site corresponds to the allele sought; and

a step during which at least one of the products of the enzyme reaction is detected
or assayed. 

33. (Amended) A [Pharmaceutical] pharmaceutical composition comprising an 
effective amount of [as active principle a] the polypeptide according to [any one of Claims] 
Claim 1 [to 4]. 

34. (Amended) A [Pharmaceutical] pharmaceutical composition according to [the 
preceding claim, characterized in that it comprises] Claim 33, comprising a polypeptide
comprising an amino acid sequence selected from SEQ ID No. 6, SEQ ID No. 13, SEQ ID No.
15, SEQ ID No. 17 and SEQ ID No. 19.

35. (Amended) A [Pharmaceutical] pharmaceutical composition containing an inhibitor 
or an activator of SR-p70 activity. 

36. (Amended) A [Pharmaceutical] pharmaceutical composition containing a 
polypeptide derived from a polypeptide according to [any one of Claims] Claim 1 [to 5, 
characterized in that it] which is an inhibitor or an activator of SR-p70. 

Please add the following new claims. 

37. (New) The use of a probe or primer according to Claim 16 as an in vitro diagnostic 
tool for the detection, by hybridization experiments, of nucleic acid sequences coding for a 
polypeptide, wherein said polypeptide comprises an amino acid sequence selected from the 
group consisting of: 

a) sequence SEQ ID No. 2; 

b) sequence SEQ ID No. 4; 

c) sequence SEQ ID No. 6; 

d) sequence SEQ ID No. 8; 

e) sequence SEQ ID No. 10; 

f) sequence SEQ ID No. 13; 

g) sequence SEQ ID No. 15;

h) sequence SEQ ID No. 17;

i) sequence SEQ ID No. 19; and 

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4, 
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 
15, SEQ ID No. 17 or SEQ ID No. 19 in biological samples, or for the 
demonstration of aberrant syntheses or of genetic abnormalities. 

38. (New) A method of in vitro diagnosis for the detection of aberrant syntheses or of genetic 
abnormalities in the nucleic acid sequences coding for a polypeptide, said polypeptide
comprising an amino acid sequence selected from the group consisting of: 

a) sequence SEQ ID No. 2; 

b) sequence SEQ ID No. 4; 

c) sequence SEQ ID No. 6; 

d) sequence SEQ ID No. 8; 

e) sequence SEQ ID No. 10; 

f) sequence SEQ ID No. 13; 

g) sequence SEQ ID No. 15; 

h) sequence SEQ ID No. 17; 

i) sequence SEQ ID No. 19; and 

j) any biologically active sequence derived from SEQ ID No. 2, SEQ ID No. 4, 
SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 
15, SEQ ID No. 17 or SEQ ID No. 19 
comprising the steps of: 

bringing of a nucleotide probe according to Claim 16 into contact with a biological 
sample under conditions permitting the formation of a hybridization complex 
between the probe and the nucleotide sequence, where appropriate, after a prior step 
of amplification of the nucleotide sequence; 
the detection of the hybridization complex formed; and 

where appropriate, sequencing of the hybridization complex' nucleotide sequence 
with the probe of the invention.

REMARKS

Claims 1-36 have been amended in order to limit the multiple dependencies of these 
claims and to present them in the appropriate U.S. claim format. 

New claims 37 and 38 have been added by the foregoing amendments. Support for 
these amendments can be found, for example, in original claims 22 and 23, wherein the subject 
matter now claimed is specifically set forth.

Abstract



The invention relates to new nucleic acid sequences of
the family of tumour-suppressing genes related to the
gene for the p53 protein, and to the corresponding
protein sequences.



The invention relates to new nucleic acid 
sequences of the family of tumour-suppressing genes
related to the gene for the p53 protein, and to the 
corresponding protein sequences. 

The invention also relates to the prophylactic, 
therapeutic and diagnostic applications of these 
sequences, in particular in the field of pathologies 
linked to the phenomena of apoptosis or of cell 
transformation. 

Tumour-suppressing genes perform a key role in
protection against the phenomena of carcinogenesis, and 
any modification capable of bringing about the loss of 
one of these genes, its inactivation or its dysfunction 
may have oncogenic character, thereby creating favourable 
conditions for the development of a malignant tumour. 

The authors of the present invention have
identified transcription products of a new gene, as well
as the corresponding proteins. This gene, SR-p70, is
related to the p53 tumour-suppressing gene, the
antitumour activity of which is linked to its
transcription factor activity, and more specifically to
the controls exerted on the activity of the Bax and Bcl-2
genes which are instrumental in the mechanisms of cell
death.

Hence the present invention relates to purified
SR-p70 proteins, or biologically active fragments of the
latter.

The invention also relates to isolated nucleic 
acid sequences coding for the said proteins or their 
biologically active fragments, and to specific 
oligonucleotides obtained from these sequences. 

It relates, in addition, to the cloning and/or
expression vectors containing at least one of the
nucleotide sequences defined above, and the host cells
transfected by these cloning and/or expression vectors
under conditions permitting the replication and/or
expression of one of the said nucleotide sequences.

The methods of production of recombinant SR-p70 
proteins or their biologically active fragments by the 
transfected host cells also form part of the invention.

The invention also comprises antibodies or
antibody derivatives specific for the proteins defined
above.

It relates, in addition, to methods of detection
of cancers, either by measuring the accumulation of
SR-p70 proteins in the tumours according to
immunohistochemical techniques, or by demonstrating
autoantibodies directed against these proteins in
patients' serum.

The invention also relates to any inhibitor or 
activator of SR-p70 activity, for example of protein- 
protein interaction, involving SR-p70. 

It also relates to antisense oligonucleotide
sequences specific for the above nucleic acid sequences,
capable of modulating in vivo the expression of the
SR-p70 gene.

Lastly, the invention comprises a method of gene 
therapy, in which vectors such as, for example, 
inactivated viral vectors capable of transferring coding 
sequences for a protein according to the invention are 
injected into cells deficient for this protein, for 
purposes of regulating the phenomena of apoptosis or of 
reversion of transformation. 

A subject of the present invention is a purified 
polypeptide comprising an amino acid sequence selected 
from: 

a) the sequence SEQ ID No. 2; 

b) the sequence SEQ ID No. 4; 

c) the sequence SEQ ID No. 6; 

d) the sequence SEQ ID No. 8; 

e) the sequence SEQ ID No. 10; 

f) the sequence SEQ ID No. 13; 

g) the sequence SEQ ID No. 15; 

h) the sequence SEQ ID No. 17; 

i) the sequence SEQ ID No. 19; 

j) any biologically active sequence derived from
SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8,
SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No.
17 or SEQ ID No. 19.



In the description of the invention, the 
following definitions are used: 

- SR-p70 protein: a polypeptide comprising an
amino acid sequence selected from the sequences SEQ ID
No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID
No. 10, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17 or
SEQ ID No. 19, or any biologically active fragment or
derivative of this polypeptide;

- derivative: any variant polypeptide of the
polypeptide of sequence SEQ ID No. 2, SEQ ID No. 4, SEQ
ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ
ID No. 15, SEQ ID No. 17 or SEQ ID No. 19, or any
molecule resulting from a modification of a genetic
and/or chemical nature of the sequence SEQ ID No. 2, SEQ
ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ
ID No. 13, SEQ ID No. 15, SEQ ID No. 17 or SEQ ID No. 19,
that is to say obtained by mutation, deletion, addition,
substitution and/or chemical modification of a single
amino acid or of a limited number of amino acids, as well
as any isoform sequence, that is to say sequence
identical to the sequence SEQ ID No. 2, SEQ ID No. 4, SEQ
ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ
ID No. 15, SEQ ID No. 17 or SEQ ID No. 19, or to one of
its fragments or modified sequences, containing one or
more amino acids in the form of the D enantiomer, the
said variant, modified or isoform sequences having
retained at least one of the properties that make them
biologically active;

- biologically active: capable of binding to DNA
and/or of exerting transcription factor activity and/or
of participating in the control of the cell cycle, of
differentiation and of apoptosis and/or capable of being
recognized by the antibodies specific for the polypeptide
of sequence SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ
ID No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15,
SEQ ID No. 17 or SEQ ID No. 19, and/or capable of
inducing antibodies which recognize this polypeptide.

The manufacture of derivatives may have different
objectives, including especially that of increasing the
affinity of the polypeptide for DNA or its transcription
factor activity, and that of improving its levels of
production, of increasing its resistance to proteases, of
modifying its biological activities or of endowing it
with new pharmaceutical and/or biological properties.

Among the polypeptides of the invention, the 
polypeptide of human origin comprising the sequence SEQ 
ID No. 6, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17 or 
SEQ ID No. 19 is preferred. The polypeptide of 636 amino 
acids corresponding to the sequence SEQ ID No. 6 is more
than 97% identical to the polypeptide of sequence SEQ ID 
No. 2. 

The polypeptide of sequence SEQ ID No. 2 and that 
of sequence SEQ ID No. 4 are two expression products of 
the same gene, and the same applies to the sequences SEQ
ID No. 8 and SEQ ID No. 10 and to the sequences SEQ ID
No. 6, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17 or SEQ
ID No. 19.

As will be explained in the examples, the
polypeptide of sequence SEQ ID No. 4 corresponds to a
premature termination of the peptide of sequence SEQ ID
No. 2, linked to an alternative splicing of the longer
transcript (messenger RNA), coding for the polypeptide of
SEQ ID No. 2, of the corresponding gene. Similarly, in
humans, the polypeptides corresponding to the sequences
SEQ ID No. 6, SEQ ID No. 13, SEQ ID No. 15, SEQ ID No. 17
and SEQ ID No. 19, diverge in their composition in
respect of the N- and/or C-terminal portions, this being
the outcome of alternative splicing of the same primary
transcript. The N-terminal peptide sequence of the
sequence SEQ ID No. 10 is deleted, this being linked to
an alternative splicing of its coding transcript.

Advantageously, the invention relates to a 
polypeptide corresponding to the DNA binding domain of 
one of the above polypeptides.

This domain corresponds to the sequence lying 
between residue 110 and residue 310 for the sequences SEQ 
ID No. 2 or 6, and between residue 60 and residue 260 for 
the sequence SEQ ID No. 8.
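
As an illustration only, the residue ranges quoted above can be applied to a sequence string as in the following minimal Python sketch. The function name `extract_region`, the dictionary `DNA_BINDING_DOMAIN` and the placeholder string are assumptions introduced for this example; the placeholder is not an SR-p70 sequence, and residue numbering is assumed to be 1-based and inclusive.

```python
def extract_region(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region lies outside the sequence")
    return sequence[start - 1:end]


# Residue ranges quoted in the text, keyed by sequence identifier.
DNA_BINDING_DOMAIN = {
    "SEQ ID No. 2": (110, 310),
    "SEQ ID No. 6": (110, 310),
    "SEQ ID No. 8": (60, 260),
}

# Placeholder of 636 residues for illustration only; NOT a real SR-p70 sequence.
# Residues 110..310 are marked "D" so the extracted region is easy to check.
placeholder = "A" * 109 + "D" * 201 + "A" * 326

domain = extract_region(placeholder, *DNA_BINDING_DOMAIN["SEQ ID No. 6"])
```

With real data, `placeholder` would be replaced by the amino acid sequence taken from the sequence listing.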



A subject of the present invention is also 
nucleic acid sequences coding for a SR-p70 protein or 
biologically active fragments or derivatives of the 
latter. 

More preferably, a subject of the invention is an 
isolated nucleic acid sequence selected from: 



a) the sequence SEQ ID No. 1;

b) the sequence SEQ ID No. 3;

c) the sequence SEQ ID No. 5;

d) the sequence SEQ ID No. 7;

e) the sequence SEQ ID No. 9;

f) the sequence SEQ ID No. 11;

g) the sequence SEQ ID No. 12;

h) the sequence SEQ ID No. 14;

i) the sequence SEQ ID No. 16;

j) the sequence SEQ ID No. 18;

k) the nucleic acid sequences capable of
hybridizing specifically with the sequence SEQ ID No. 1,
SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9,
SEQ ID No. 11, SEQ ID No. 12, SEQ ID No. 14, SEQ ID No.
16 or SEQ ID No. 18 or with the sequences complementary
to them, or of hybridizing specifically with their
proximal sequences;

l) the sequences derived from the sequences a),
b), c), d), e), f), g), h), i), j) or k) as a result of
the degeneracy of the genetic code.

According to a preferred embodiment, a subject of 
the invention is nucleotide sequences SEQ ID No. 5, SEQ 
ID No. 12, SEQ ID No. 14, SEQ ID No. 16 and SEQ ID No. 
18, corresponding, respectively, to the cDNAs of the 
human proteins of the sequences SEQ ID No. 6, SEQ ID No.
13, SEQ ID No. 15, SEQ ID No. 17 and SEQ ID No. 19.

The different nucleotide sequences of the 
invention may be of artificial origin or otherwise. They 
can be DNA or RNA sequences obtained by the screening of 
libraries of sequences by means of probes prepared on the 
basis of the sequences SEQ ID No. 1, 3, 5, 7, 9, 11, 12,
14, 16 or 18. Such libraries may be prepared by
traditional techniques of molecular biology which are
known to a person skilled in the art.

The nucleotide sequences according to the 
invention may also be prepared by chemical synthesis, or 
alternatively by mixed methods including the chemical or 
enzymatic modification of sequences obtained by the
screening of libraries. 

These nucleotide sequences enable nucleotide
probes to be produced which are capable of hybridizing
strongly and specifically with a nucleic acid sequence,
of a genomic DNA or of a messenger RNA, coding for a
polypeptide according to the invention or a biologically
active fragment of the latter. Such probes also form part
of the invention. They may be used as an in vitro
diagnostic tool for the detection, by hybridization
experiments, of transcripts specific for the polypeptides
of the invention in biological samples, or for the
demonstration of aberrant syntheses or of genetic
abnormalities such as loss of heterozygosity or genetic
rearrangement resulting from a polymorphism, from
mutations or from a different splicing.

The probes of the invention contain at least 10 
nucleotides, and contain at most the whole of the 
sequence of the SR-p70 gene or of its cDNA contained, for 
example, in a cosmid.

Among the shortest probes, that is to say of
approximately 10 to 20 nucleotides, the appropriate
hybridization conditions correspond to the stringent
conditions normally used by a person skilled in the art.

The temperature used is preferably between
Tm -5°C and Tm -30°C, and as a further preference between
Tm -5°C and Tm -10°C, Tm being the melting temperature,
the temperature at which 50% of the paired DNA strands
separate.
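
As a rough illustration of the temperature window just described, the following Python sketch derives a hybridization range from a melting temperature. The Wallace rule used here to estimate Tm for a short oligonucleotide (2°C per A or T, 4°C per G or C) is an assumption brought in for the example; the description does not state how Tm is determined, and the function names are hypothetical.

```python
def wallace_tm(oligo: str) -> int:
    """Rough Tm in deg C for a short oligo: 2*(A+T) + 4*(G+C) (Wallace rule, assumed)."""
    seq = oligo.replace(" ", "").upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hybridization_window(tm: float, preferred: bool = False) -> tuple:
    """Return (lowest, highest) hybridization temperature in deg C.

    General range: Tm-30 to Tm-5. Preferred range: Tm-10 to Tm-5.
    """
    low_offset = 10 if preferred else 30
    return (tm - low_offset, tm - 5)


# A 17-mer with 13 G/C and 4 A/T residues, written in the spaced triplet style.
tm = wallace_tm("GCG AGC TGC CCT CGG AG")
general_range = hybridization_window(tm)
preferred_range = hybridization_window(tm, preferred=True)
```

For longer probes or other salt conditions a different Tm estimate would be needed; only the offsets from Tm are taken from the text.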

The hybridization is preferably conducted in
solutions of high ionic strength, such as, in particular,
6 x SSC solutions.

Advantageously, the hybridization conditions used
are as follows:
- temperature: 42°C,
- hybridization buffer: 6 x SSC, 5 x Denhardt's, 0.1% SDS,
as described in Example III.

Advantageously, these probes are represented by
the following oligonucleotides or the sequences
complementary to them:

SEQ ID No. 20: GCG AGC TGC CCT CGG AG
SEQ ID No. 21: GGT TCT GCA GGT GAC TCA G
SEQ ID No. 22: GCC ATG CCT GTC TAC AAG
SEQ ID No. 23: ACC AGC TGG TTG ACG GAG
SEQ ID No. 24: GTC AAC CAG CTG GTG GGC CAG
SEQ ID No. 25: GTG GAT CTC GGC CTC C
SEQ ID No. 26: AGG CCG GCG TGG GGA AG
SEQ ID No. 27: CTT GGC GAT CTG GCA GTA G
SEQ ID No. 28: GCG GCC ACG ACC GTG AC
SEQ ID No. 29: GGC AGC TTG GGT CTC TGG
SEQ ID No. 30: CTG TAC GTC GGT GAC CCC
SEQ ID No. 31: TCA GTG GAT CTC GGC CTC
SEQ ID No. 32: AGG GGA CGC AGC GAA ACC
SEQ ID No. 33: CCA TCA GCT CCA GGC TCT C
SEQ ID No. 34: CCA GGA CAG GCG CAG ATG
SEQ ID No. 35: GAT GAG GTG GCT GGC TGG A
SEQ ID No. 36: TGG TCA GGT TCT GCA GGT G
SEQ ID No. 37: CAC CTA CTC CAG GGA TGC
SEQ ID No. 38: AGG AAA ATA GAA GCG TCA GTC
SEQ ID No. 39: CAG GCC CAC TTG CCT GCC
SEQ ID No. 40: CTG TCC CCA AGC TGA TGA G
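
Since the probes may also be "the sequences complementary to them", a short Python sketch of how the complementary strand of a listed oligonucleotide could be derived may be helpful. The helper name is hypothetical and the code is only an illustration of standard Watson-Crick base pairing, not part of the described method.

```python
# Translation table for Watson-Crick pairing of DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(oligo: str) -> str:
    """Return the complementary strand, written 5' to 3', without spaces."""
    seq = oligo.replace(" ", "").upper()
    return seq.translate(_COMPLEMENT)[::-1]


# Complement of the 17-mer listed as SEQ ID No. 20.
complement_of_seq20 = reverse_complement("GCG AGC TGC CCT CGG AG")
```

Applying the function twice returns the original sequence (without the triplet spacing), which is a convenient sanity check when transcribing primers by hand.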



Preferably, the probes of the invention are
labelled prior to their use. To this end, several
techniques are within the capacity of a person skilled in
the art (fluorescent, radioactive, chemiluminescent or
enzymatic labelling, and the like).

The in vitro diagnostic methods in which these 
nucleotide probes are employed are included in the 
subject of the present invention. 

These methods relate, for example, to the
detection of abnormal syntheses (e.g. accumulation of
transcription products) or of genetic abnormalities, such
as loss of heterozygosity and genetic rearrangement, and
point mutations in the nucleotide sequences of nucleic
acids coding for an SR-p70 protein, according to the
definition given above.

The nucleotide sequences of the invention are 
also useful for the manufacture and use of 
oligonucleotide primers for sequencing reactions or 
specific amplification reactions according to the so- 
called PCR technique or any variant of the latter (ligase 
chain reaction (LCR), etc.).

Preferred primer pairs consist of primers 
selected from the nucleotide sequences: SEQ ID No. 1: 
monkey sequence of 2,874 nucleotides, and SEQ ID No. 5: 
human SR-p70a cDNA, in particular upstream of the ATG
translation initiation codon and downstream of the TGA 
translation stop codon. 

Advantageously, these primers are represented by 
the following pairs: 

- pair No. 1:
sense primer: GCG AGC TGC CCT CGG AG (SEQ ID No. 20)
antisense primer: GGT TCT GCA GGT GAC TCA G (SEQ ID No. 21)

- pair No. 2:
sense primer: GCC ATG CCT GTC TAC AAG (SEQ ID No. 22)
antisense primer: ACC AGC TGG TTG ACG GAG (SEQ ID No. 23)

- pair No. 3:
sense primer: GTC AAC CAG CTG GTG GGC CAG (SEQ ID No. 24)
antisense primer: GTG GAT CTC GGC CTC C (SEQ ID No. 25)

- pair No. 4:
sense primer: AGG CCG GCG TGG GGA AG (SEQ ID No. 26)
antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27)

- pair No. 5:
sense primer: GCG GCC ACG ACC GTG AC (SEQ ID No. 28)
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29)

- pair No. 6:
sense primer: CTG TAC GTC GGT GAC CCC (SEQ ID No. 30)
antisense primer: TCA GTG GAT CTC GGC CTC (SEQ ID No. 31)

- pair No. 7:
sense primer: AGG GGA CGC AGC GAA ACC (SEQ ID No. 32)
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29)

- pair No. 8:
sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T)
antisense primer: CCA TCA GCT CCA GGC TCT C (SEQ ID No. 33)

- pair No. 9:
sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T)
antisense primer: CCA GGA CAG GCG CAG ATG (SEQ ID No. 34)

- pair No. 10:
sense primer: CCCCCCCCCCCCCCCN (where N equals G, A or T)
antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27)

- pair No. 11:
sense primer: CAC CTA CTC CAG GGA TGC (SEQ ID No. 37)
antisense primer: AGG AAA ATA GAA GCG TCA GTC (SEQ ID No. 38)

- pair No. 12:
sense primer: CAG GCC CAC TTG CCT GCC (SEQ ID No. 39)
antisense primer: CTG TCC CCA AGC TGA TGA G (SEQ ID No. 40)

These primers correspond to the sequences 
extending, respectively: 

- from nucleotide No. 124 to nucleotide No. 140 
on SEQ ID No. 1 and from nucleotide No. 1 to 
nucleotide No. 17 on SEQ ID No. 5 for SEQ ID 
No. 20

- from nucleotide No. 2280 to nucleotide No. 2262 
on SEQ ID No. 1 and from nucleotide No. 2156 to 
nucleotide 2138 on SEQ ID No. 5 for SEQ ID No. 
21 

- from nucleotide No. 684 to nucleotide No. 701
on SEQ ID No. 1 for SEQ ID No. 22 

- from nucleotide No. 1447 to nucleotide No. 1430 
on SEQ ID No. 1 and from nucleotide 1324 to 
nucleotide 1307 on SEQ ID No. 5 for SEQ ID No. 
23 

- from nucleotide 1434 to nucleotide 1454 on SEQ 
ID No. 1 and from nucleotide 1311 to nucleotide 
1331 on SEQ ID No. 5 for SEQ ID No. 24 

- from nucleotide 2066 to nucleotide 2051 on SEQ 
ID No. 1 and from nucleotide 1940 to nucleotide 
1925 on SEQ ID No. 5 for SEQ ID No. 25

- from nucleotide 16 to nucleotide 32 on SEQ ID
No. 5 for SEQ ID No. 26 

- from nucleotide 503 to nucleotide 485 on SEQ ID 
No. 5 for SEQ ID No. 27 

- from nucleotide 160 to nucleotide 176 on SEQ ID 
No. 11 for SEQ ID No. 28 

- from nucleotide 1993 to nucleotide 1976 on SEQ 
ID NO. 5 for SEQ ID No. 29 

- from nucleotide 263 to nucleotide 280 on SEQ ID
No. 11 for SEQ ID No. 30 

- from nucleotide 1943 to nucleotide 1926 on SEQ 
ID No. 5 for SEQ ID No. 31 

- from nucleotide 128 to nucleotide 145 on the 
nucleotide sequence depicted in Figure 22 for 
SEQ ID No. 32 

- from nucleotide 1167 to nucleotide 1149 on SEQ 
ID No. 5 for SEQ ID No. 33 

- from nucleotide 928 to nucleotide 911 on SEQ ID 
No. 5 for SEQ ID No. 34 

- from nucleotide 677 to nucleotide 659 on SEQ ID 
No. 5 for SEQ ID No. 35

- from nucleotide 1605 to nucleotide 1587 on SEQ 
ID No. 5 for SEQ ID No. 36 

- from nucleotide 1 to nucleotide 18 on the 
nucleotide sequence depicted in Figure 13 for 
SEQ ID No. 37 

- from nucleotide 833 to nucleotide 813 on the 
nucleotide sequence depicted in Figure 13 for 
SEQ ID No. 38

- from nucleotide 25 to nucleotide 42 on the 
nucleotide sequence depicted in Figure 13 for 
SEQ ID No. 39

- from nucleotide 506 to nucleotide 488 on the 
nucleotide sequence depicted in Figure 13 for 
SEQ ID No. 40 
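
The coordinates above can be cross-checked against the primer lengths. The following Python sketch assumes inclusive 1-based numbering and reads a descending range as an antisense primer; both conventions are inferred from the list rather than stated in it, and the function name is hypothetical.

```python
def primer_span(start: int, end: int) -> tuple:
    """Return (length, strand) for an inclusive nucleotide range.

    A descending range (start > end) is interpreted as an antisense primer.
    """
    length = abs(end - start) + 1
    strand = "sense" if end >= start else "antisense"
    return length, strand


# SEQ ID No. 20 on SEQ ID No. 1: nucleotides 124 to 140.
span_seq20 = primer_span(124, 140)
# SEQ ID No. 21 on SEQ ID No. 1: nucleotides 2280 to 2262.
span_seq21 = primer_span(2280, 2262)
```

The computed lengths (17 and 19 nucleotides) agree with the lengths of the corresponding oligonucleotide sequences.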

The nucleotide sequences according to the
invention can have, moreover, uses in gene therapy, in
particular for controlling the phenomena of apoptosis and
of reversion of transformation.

The nucleotide sequences according to the
invention may, moreover, be used for the production of
recombinant SR-p70 proteins, according to the definition
which has been given to this term.

These proteins may be produced from the
nucleotide sequences defined above, according to
techniques of production of recombinant products which
are known to a person skilled in the art. In this case,
the nucleotide sequence used is placed under the control
of signals permitting its expression in a cell host.

An effective system for production of a
recombinant protein requires a vector, for example of
plasmid or viral origin, and a compatible host cell.

The cell host may be selected from prokaryotic
systems such as bacteria, or eukaryotic systems such as,
for example, yeasts, insect cells, CHO cells (Chinese
hamster ovary cells) or any other system advantageously
available. A preferred cell host for the expression of
proteins of the invention consists of the E. coli
bacterium, in particular the strain MC 1061 (Clontech).

The vector must contain a promoter, translation
initiation and termination signals and also the
appropriate transcription regulation regions. It must be
capable of being maintained stably in the cell and can,
where appropriate, possess particular signals specifying
the secretion of the translated protein.

These various control signals are selected in
accordance with the cell host used. To this end, the
nucleotide sequences according to the invention may be
inserted into vectors which are autonomously replicating
within the selected host, or vectors which are
integrative for the chosen host. Such vectors will be
prepared according to methods commonly used by a person
skilled in the art, and the clones resulting therefrom
may be introduced into a suitable host by standard
methods such as, for example, electroporation.

The cloning and/or expression vectors containing
at least one of the nucleotide sequences defined above
also form part of the present invention.

A preferred cloning and expression vector is the
plasmid pSE1, which contains the elements necessary for
its use both as a cloning vector in E. coli (origin of
replication in E. coli and ampicillin resistance gene
originating from the plasmid pTZ 18R) and as an
expression vector in animal cells (promoter, intron,
polyadenylation site, origin of replication of the SV40
virus), as well as the elements enabling it to be copied
as a single strand with the object of sequencing (origin
of replication of phage f1).

The characteristics of this plasmid are described
in Application EP 0,506,574.

Its construction and also the integration of the
cDNAs originating from the nucleic acid sequences of the
invention are, moreover, described in the examples below.

According to a preferred embodiment, the proteins
of the invention are in the form of fusion proteins, in
particular in the form of a protein fused with
glutathione S-transferase (GST). A designated expression
vector in this case is represented by the plasmid vector
pGEX-4T-3 (Pharmacia ref. 27-4583).

The invention relates, in addition, to the host
cells transfected by these aforementioned vectors. These
cells may be obtained by introducing into host cells a
nucleotide sequence inserted into a vector as defined
above, followed by culturing of the said cells under
conditions permitting the replication and/or expression
of the transfected nucleotide sequence.

These cells are usable in a method of production
of a recombinant polypeptide of sequence SEQ ID No. 2,
SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10,
SEQ ID No. 12, SEQ ID No. 14, SEQ ID No. 16 or SEQ ID No.
18 or any biologically active fragment or derivative of
the latter.

The method of production of a polypeptide of the
invention in recombinant form is itself included in the
present invention, and is characterized in that the
transfected cells are cultured under conditions
permitting the expression of a recombinant polypeptide of
sequence SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID
No. 8, SEQ ID No. 10, SEQ ID No. 12, SEQ ID No. 14, SEQ
ID No. 16 or SEQ ID No. 18 or of any biologically active
fragment or derivative of the latter, and in that the
said recombinant polypeptide is recovered.

The purification methods used are known to a
person skilled in the art. The recombinant polypeptide
may be purified from lysates and cell extracts or from
the culture medium supernatant, by methods used
individually or in combination, such as fractionation,
chromatographic methods, immunoaffinity techniques using
specific mono- or polyclonal antibodies, and the like. A
preferred variant consists in producing a recombinant
polypeptide fused to a "carrier" protein (chimeric
protein). The advantage of this system is that it permits
a stabilization and a decrease in proteolysis of the
recombinant product, an increase in solubility during in
vitro renaturation and/or a simplification of the
purification when the fusion partner possesses an
affinity for a specific ligand.

Advantageously, the polypeptides of the invention
are fused with glutathione S-transferase at the
N-terminal position (Pharmacia "GST" system). The fusion
product is, in this case, detected and quantified by
means of the enzyme activity of the GST. The colorimetric
reagent used is a glutathione acceptor, a substrate for
GST. The recombinant product is purified on a
chromatographic support to which glutathione molecules
have been coupled beforehand.

The mono- or polyclonal antibodies capable of
specifically recognizing an SR-p70 protein according to
the definition given above also form part of the
invention. Polyclonal antibodies may be obtained,
according to standard procedures, from the serum of an
animal immunized against the protein, produced, for
example, by genetic recombination according to the
method described above.

The monoclonal antibodies may be obtained
according to the traditional hybridoma culture method
described by Kohler and Milstein, Nature, 1975, 256,
495-497.

Advantageous antibodies are antibodies directed
against the central region lying between residue 110 and
residue 310 for the sequences SEQ ID No. 2 or 6, or
between residue 60 and residue 260 for the sequence SEQ
ID No. 8.

The antibodies according to the invention are,
for example, chimeric antibodies, humanized antibodies or
Fab and F(ab')2 fragments. They may also take the form of
immunoconjugates or labelled antibodies.

Moreover, besides their use for the purification
of the recombinant polypeptides, the antibodies of the
invention, especially the monoclonal antibodies, may also
be used for detecting these polypeptides in a biological
sample.

Thus they constitute a means of
immunocytochemical or immunohistochemical analysis of the
expression of SR-p70 proteins on sections of specific
tissues, for example by immunofluorescence, gold
labelling or enzyme immunoconjugates.

They make it possible, in particular, to
demonstrate an abnormal accumulation of SR-p70 proteins
in certain tissues or biological samples, which makes
them useful for detecting cancers or monitoring the
progression or remission of pre-existing cancers.

More generally, the antibodies of the invention
may be advantageously employed in any situation where the
expression of an SR-p70 protein has to be observed.

Hence the invention also relates to a method of
in vitro diagnosis of pathologies correlated with an
expression or an abnormal accumulation of SR-p70
proteins, in particular the phenomena of carcinogenesis,
from a biological sample, characterized in that at least
one antibody of the invention is brought into contact
with the said biological sample under conditions
permitting the possible formation of specific
immunological complexes between an SR-p70 protein and the
said antibody or antibodies, and in that the specific
immunological complexes possibly formed are detected.

The invention also relates to a kit for the in
vitro diagnosis of an abnormal expression or
accumulation of SR-p70 proteins in a biological sample
and/or for measuring the level of expression of this
protein in the said sample, comprising:

- at least one antibody specific for an SR-p70
protein, optionally bound to a support,
- means of visualization of the formation of
specific antigen-antibody complexes between an
SR-p70 protein and the said antibody, and/or
means of quantification of these complexes.

The invention also relates to a method of early
diagnosis of tumour formation, by detecting
autoantibodies directed against an SR-p70 protein in an
individual's serum.

Such a method of early diagnosis is characterized
in that a serum sample drawn from an individual is
brought into contact with a polypeptide of the invention,
optionally bound to a support, under conditions
permitting the formation of specific immunological
complexes between the said polypeptide and the
autoantibodies possibly present in the serum sample, and
in that the specific immunological complexes possibly
formed are detected.

A subject of the invention is also a method of
determination of an allelic variability, a mutation, a
deletion, an insertion, a loss of heterozygosity or a
genetic abnormality of the SR-p70 gene which may be
involved in pathologies, characterized in that it
utilizes at least one nucleotide sequence described
above. Among the methods of determination of an allelic
variability, a mutation, a deletion, an insertion, a loss
of heterozygosity or a genetic abnormality of the SR-p70
gene, preference is given to the method which is
characterized in that it comprises at least one step of
PCR amplification of the target nucleic acid sequence of
SR-p70 liable to exhibit a polymorphism, a mutation, a
deletion or an insertion, using a pair of primers of
nucleotide sequences defined above, a step during which
the amplified products are treated using a suitable
restriction enzyme and a step during which at least one
of the products of the enzyme reaction is detected or
assayed.
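
The principle of the amplification-and-digestion method can be sketched in Python as follows. The example assumes the enzyme StyI, whose recognition sequence is CCWWGG (W being A or T); the two short allele sequences are invented for illustration and are not taken from the SR-p70 gene, and the function name is hypothetical.

```python
import re

# StyI recognition sequence CCWWGG, W = A or T. The lookahead allows
# overlapping sites to be reported.
STYI_SITE = re.compile(r"(?=(CC[AT][AT]GG))")


def styi_sites(sequence: str) -> list:
    """Return the 1-based start positions of StyI sites in a sequence."""
    return [m.start() + 1 for m in STYI_SITE.finditer(sequence.upper())]


# Hypothetical amplified fragments differing by a single C/T position.
allele_c = "AAGTCCACGGTTA"  # CCACGG: fourth base is C, so no StyI site
allele_t = "AAGTCCATGGTTA"  # CCATGG: a StyI site starting at position 5

sites_c = styi_sites(allele_c)
sites_t = styi_sites(allele_t)
```

A single-base difference of this kind changes the digestion pattern of the amplified product, which is what the detection step relies on.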

The invention also comprises pharmaceutical
compositions comprising as active principle a polypeptide
corresponding to the above definitions, preferably in
soluble form, in combination with a pharmaceutically
acceptable vehicle.

Such compositions afford a novel approach to
treating the phenomena of carcinogenesis at the level of
the control of multiplication and cell differentiation.

Preferably, these compositions can be
administered systemically, preferably intravenously,
intramuscularly, intradermally or orally.

Their optimal modes of administration, dosages
and pharmaceutical dosage forms may be determined
according to the criteria generally borne in mind in
establishing a therapeutic treatment suitable for a
patient, such as, for example, the patient's age or body
weight, the severity of his or her general state, the
tolerability of treatment and the observed side effects,
and the like.

Lastly, the invention comprises a method of gene 
therapy, in which nucleotide sequences coding for an SR- 
p70 protein are transferred to target cells by means of 
inactivated viral vectors. 

Other features and advantages of the invention
are to be found in the remainder of the description, with
the examples and the figures for which the legends are
given below.

LEGEND TO THE FIGURES

Figure 1: Nucleic acid comparison of monkey SR-p70a
cDNA (corresponding to SEQ ID No. 1) with
the nucleic acid sequence of monkey p53
cDNA.

Figure 2: Protein comparison of monkey SR-p70a with
monkey p53 protein (sw: p53-cerae).

Figure 3: Comparison of the nucleic acid sequence of
monkey SR-p70a and b cDNA (corresponding,
respectively, to SEQ ID No. 1 and SEQ ID
No. 3).

Figure 4: Nucleic acid sequence and deduced protein
sequence of monkey SR-p70a.

Figure 5: Partial nucleic acid sequence and complete
deduced protein sequence of monkey SR-p70b.

Figure 6: Partial nucleic acid sequence and deduced
complete protein sequence of human SR-p70a
(corresponding to SEQ ID No. 5).

Figure 7: Partial nucleic acid sequence and complete
deduced protein sequence of mouse SR-p70c
(corresponding to SEQ ID No. 7).

Figure 8: Partial nucleic acid sequence and partially
deduced protein sequence of mouse SR-p70a
(corresponding to SEQ ID No. 9).

Figure 9: Multialignment of the proteins deduced from
monkey (a and b), human (a) and mouse (a and
c) SR-p70 cDNAs.

Figure 10a: Immunoblot of the SR-p70 protein.

Figure 10b: Detection of the endogenous SR-p70 protein.

Figure 11: Chromosomal localization of the human SR-p70
gene. The signal appears on chromosome 1, in
the p36 region.

Figure 12: Genomic structure of the SR-p70 gene and
comparison with that of the p53 gene. The
human protein sequences of SR-p70a (upper
line of the alignment) and of p53 (lower
line) are divided up into peptides on the
basis of the respective exons from which
they are encoded. The figures beside the
arrows correspond to the numbering of the
corresponding exons.

Figure 13: Human genomic sequence of SR-p70 from the 3'
end of intron 1 to the 5' end of exon 3. The
introns are boxed. At positions 123 and 133,
two variable nucleic acid positions are
localized (G → A at 123 and C → T at 133).
The restriction sites for the enzyme StyI
are underlined (position 130 in the case
where a T is present instead of a C at
position 133, position 542 and position
610). The arrows indicate the positions of
the nucleic acid primers used in Example XI.

Figure 14: Nucleic acid comparison of the 5' region of
the human cDNAs of SR-p70d and of SR-p70a.

Figure 15: Multialignment of the nucleic acid sequences
corresponding to human SR-p70a, b, d, e, and
f.

Figure 16: Multialignment of the proteins deduced from
human SR-p70 (a, b, d, e and f) cDNAs.

Figure 17: Partial nucleic acid sequence and partial
deduced protein sequence of human SR-p70a.
The two bases in bold characters correspond
to two variable positions (see Figure 6).
This sequence possesses a more complete non-
coding 5' region than the one presented in
Figure 6.

Figure 18: Analysis of the SR-p70a transcripts after
PCR amplification.
lane M: 1 kb ladder (GIBCO-BRL) molecular
weight markers
lane 1: line HT-29
lane 3: line SK-N-AS
lane 5: line UMR-32
lane 7: line U-373 MG
lane 9: line SW 480
lane 11: line CHP 212
lane 13: line SK-N-MC
lanes 2, 4, 6, 8, 10, 12, 14: negative
controls corresponding to lanes 1, 3, 5, 7,
9, 11 and 13, respectively (absence of
reverse transcriptase in the RT-PCR
reaction).

Figure 19: A: Analysis by agarose gel
electrophoresis of genomic fragments
amplified by PCR (from the 3' end of
intron 1 to the 5' end of exon 3).
The numbering of the lanes
corresponds to the numbering of the
control population. Lane M: molecular
weight markers (1 kb ladder).
B: Analysis identical to that of part
A, after digestion of the same
samples with the restriction enzyme
StyI.

Figure 20: Diagrammatic representation with a partial
restriction map of the plasmid pCDNA3
containing human SR-p70a.


EXAMPLE I

Cloning of SR-p70 cDNA from COS-3 cells

1. Culturing of COS-3 cells

COS-3 cells (African green monkey kidney cells
transformed with the SV40 virus T antigen) are cultured
in DMEM medium (GIBCO-BRL reference 41965-047)
containing 2 mM L-glutamine and supplemented with 50 mg/l
of gentamicin and 5% of foetal bovine serum (GIBCO-BRL
reference 10231-074) to semi-confluence.



2. Preparation of the messenger RNA 

a) Extraction of the messenger RNA 

The cells are recovered in the following manner: 

the adherent cells are washed twice with PBS
buffer (phosphate buffered saline, reference
04104040-GIBCO-BRL), then scraped off with a
rubber scraper and centrifuged.

The cell pellet is suspended in the lysis buffer of
the following composition: 4 M guanidine
thiocyanate; 25 mM sodium citrate pH 7; 0.5%
sarcosyl; 0.1 M β-mercaptoethanol. The suspension is
sonicated using an Ultra-Turrax No. 231256 sonicator
(Janke and Kunkel) at maximum power for one minute.
Sodium acetate pH 4 is added to a concentration of
0.2 M. The solution is extracted with one volume of
a phenol/chloroform (5/1 v/v) mixture. The RNA
contained in the aqueous phase is precipitated at
-20°C using one volume of isopropanol. The pellet is
resuspended in the lysis buffer. The solution is
extracted again with a phenol/chloroform mixture and
the RNA is precipitated with isopropanol. After
washing of the pellet with 70% and then 100%
ethanol, the RNA is resuspended in water.

b) Purification of the poly(A)+ fraction of the RNA

Purification of the poly(A)+ fraction of the RNA is
carried out using the DYNAL Dynabeads oligo(dT)25
kit (reference 610.05) according to the protocol
recommended by the manufacturer. The principle is
based on the use of superparamagnetic polystyrene
beads to which an oligonucleotide poly(dT)25 is
attached. The poly(A)+ fraction of the RNA is
hybridized with the oligo(dT)25 coupled to the
beads, which are trapped on a magnetic support.

3. Production of the complementary DNA library

a) Preparation of the complementary DNA

From 0.5 µg of the poly(A)+ RNA from COS-3 cells
obtained at the end of step 2, the [32P]dCTP-
labelled single-stranded complementary DNA is
prepared (the complementary DNA obtained possesses
a specific activity of 3000 dpm/ng) with the
synthetic primer of the following sequence
(comprising a BamHI site):

5' GATCCGGGCC CTTTTTTTTT TTT 3'

in a volume of 30 µl of buffer of composition: 50 mM
Tris-HCl pH 8.3, 6 mM MgCl2, 10 mM DTT, 40 mM KCl,
containing 0.5 mM each of the deoxynucleotide
triphosphates, 30 µCi of [α-32P]dCTP and 30 U of
RNasin (Promega). After one hour of incubation at
37°C, then 10 minutes at 50°C, then 10 minutes again
at 37°C, with 200 units of the enzyme reverse
transcriptase RNase H- (GIBCO-BRL reference 8064A), 4 µl
of EDTA are added.

b) Alkaline hydrolysis of the RNA template

6 µl of 2 N NaOH solution are added and the mixture
is then incubated for 5 minutes at 65°C.

c) Purification on a Sephacryl S-400 column

In order to remove the synthetic primer, the
complementary DNA is purified on a column of 1 ml of
Sephacryl S-400 (Pharmacia) equilibrated in TE
buffer.

The first two radioactive fractions are pooled and
precipitated with 1/10 volume of 10 M ammonium
acetate solution and 2.5 volumes of ethanol, this
being done after extraction with one volume of
chloroform.
d) Homopolymer addition of dG

The complementary DNA is elongated at the 3' end
with a dG tail with 20 units of the enzyme terminal
transferase (Pharmacia 27073001). The mixture is
incubated in 20 µl of buffer of composition: 30 mM
Tris-HCl pH 7.6, 1 mM cobalt chloride, 140 mM
cacodylic acid, 0.1 mM DTT, 1 mM dGTP, for 15
minutes at 37°C, and 2 µl of 0.5 M EDTA are then
added.

e) Steps b) and c) are repeated again

f) Pairing of the cloning vector pSE1 (EP 506,574) and
the complementary DNA in the presence of the
adaptor

The mixture is centrifuged, the pellet is dissolved
in 33 µl of TE buffer, 5 µl (125 ng) of cloning
vector pSE1, 1 µl (120 ng) of the adaptor of the
following sequence (comprising an ApaI site):

5' AAAAAAAAAAAAAGGGCCCG 3'

and 10 µl of 200 mM NaCl solution are added, and the
reaction mixture is incubated for 5 minutes at 65°C
and then allowed to cool to room temperature.

g) Ligation

The cloning vector and the single-stranded cDNA are
ligated in a volume of 100 µl with 32.5 units of the
enzyme phage T4 DNA ligase (Pharmacia reference
27087002) overnight at 15°C in a buffer of composition:
50 mM Tris-HCl pH 7.5, 10 mM MgCl2, 1 mM ATP.

h) Synthesis of the second strand of the cDNA

The proteins are removed by phenol extraction
followed by chloroform extraction, and 1/10 volume
of 10 mM ammonium acetate solution and then 2.5
volumes of ethanol are then added. The mixture is
centrifuged, the pellet is dissolved in a buffer of
composition 33 mM Tris-acetate pH 7.9, 62.5 mM
potassium acetate, 1 mM magnesium acetate and 1 mM
dithiothreitol (DTT), and the second strand of
complementary DNA is synthesized in a volume of
30 µl with 30 units of the enzyme phage T4 DNA
polymerase (Pharmacia reference 270718) and a
1 mM mixture of the four deoxynucleotide
triphosphates dATP, dCTP, dGTP and dTTP as well as
two units of phage T4 gene 32 protein (Pharmacia
reference 27-0213) for one hour at 37°C. The mixture
is extracted with phenol and the traces of phenol
are removed with a column of polyacrylamide P10
(Biogel P10-200-400 mesh - reference 15011050 -
Biorad).

i) Transformation by electroporation

E. coli MC 1061 cells are transformed with the
recombinant DNA obtained above by electroporation
using a Biorad Gene Pulser apparatus (Biorad) used
at 2.5 kV under the conditions specified by the
manufacturer, and the bacteria are then grown for
one hour in the medium known as LB medium (Sambrook
op. cit.) of composition: bactotryptone 10 g/l;
yeast extract 5 g/l; NaCl 10 g/l.

The number of independent clones is determined by
plating out a 1/1000 dilution of the transformation
after the first hour of incubation on a dish of LB
medium with the addition of 1.5% of agar (w/v) and
100 µg/ml of ampicillin, hereinafter referred to as
LB agar medium. The number of independent clones is
1 million.

Analysis of the cDNAs of the library

In the context of the analysis of individual clones
of the library by nucleic acid sequencing of the 5'
region of the cDNAs, one clone, designated SR-p70a,
was shown to exhibit a partial homology with the
cDNA of the already known protein, the p53 protein
(Genbank X 02469 and X 16384) (Figure 1). The
sequences were produced with the United States
Biochemical kit (reference 70770) and/or the Applied
Biosystems kit (references 401434 and/or 401628),
which use the method of Sanger et al., Proc. Natl.
Acad. Sci. USA, 1977, 74, 5463-5467. The plasmid DNA
is prepared with the WIZARD minipreparation kit
(Promega reference A7510). The primers used are 16-
to 22-mer oligonucleotides, complementary either to
the vector pSE1 in the region immediately at the 5'
end of the cDNA, or to the sequence of the cDNA.

A second cDNA was isolated from the same library by
screening, in a manner similar to the technique
described in EXAMPLE III.3) below, with a fragment
of the SR-p70a DNA labelled with 32P with the BRL
"Random Primers DNA labelling systems" kit
(reference 18187-013). The hybridization and washing
buffers are treated by adding 50% of formamide. The
last wash is carried out in 0.1 x SSC/0.1% SDS at
60°C. This second sequence (SR-p70b cDNA) is
identical to the first but an internal fragment has
been deleted from it (Figure 3).

The two SR-p70 cDNAs, of length 2874 nucleotides
(SR-p70a) and 2780 nucleotides (SR-p70b), correspond
to the products of a single gene, an alternative
splicing bringing about a deletion of 94 bases
between nucleotides 1637 and 1732 and a premature
termination of the corresponding encoded protein.
The proteins deduced from the two cDNAs possess 637
amino acids and 499 amino acids, respectively
(Figures 4 and 5).
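
The figures quoted for the two cDNAs can be checked with a few lines of Python. The sketch below only restates the arithmetic; reading "between nucleotides 1637 and 1732" as excluding both boundary nucleotides is an assumption, and the helper names are hypothetical.

```python
SR_P70A_CDNA_LENGTH = 2874  # nucleotides, SR-p70a
SR_P70B_CDNA_LENGTH = 2780  # nucleotides, SR-p70b


def deleted_bases(long_form: int, short_form: int) -> int:
    """Number of bases missing from the shorter splice form."""
    return long_form - short_form


def bases_strictly_between(left: int, right: int) -> int:
    """Count positions lying strictly between two flanking nucleotides."""
    return right - left - 1


deletion_from_lengths = deleted_bases(SR_P70A_CDNA_LENGTH, SR_P70B_CDNA_LENGTH)
deletion_from_positions = bases_strictly_between(1637, 1732)
```

Both calculations give 94 bases. Because 94 is not a multiple of 3, such a deletion shifts the reading frame, which is consistent with the premature termination described above.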

EXAMPLE II 

Obtaining of the sequence and cloning of the cDNA of the
SR-p70a protein from HT-29 (human colon adenocarcinoma)
cells

1) Culturing of HT-29 cells

The cells are cultured in McCoy's 5A medium (GIBCO
26600-023) with the addition of 10% of foetal calf serum
(GIBCO 10081-23) and 50 mg/l of gentamicin, to semi-
confluence.

2) Preparation of the complementary DNA 

The messenger RNA is prepared as described in
EXAMPLE I.2. The cDNA is prepared in a manner similar to
that described in EXAMPLE I.3, with 5 µg of total
messenger RNA, using a poly(T)12 primer. The reaction is
not interrupted with EDTA.



3) Specific amplification of the human cDNA by the so-
called PCR technique

The polymerization is carried out with 4 µl of
cDNA in 50 µl final with the buffer of the following
composition: 10 mM Tris-HCl pH 8.3, 2.5 mM MgCl2, 50 mM
KCl in the presence of 10% DMSO, 0.5 mM dNTP, 4 µg/ml of
each of the two nucleic acid primers and 2.5 units of TAQ
DNA polymerase (Boehringer). The primer pairs were
selected on the basis of the nucleic acid sequence of the
COS-3 SR-p70 clone, in particular upstream of the
translation initiation ATG and downstream of the
translation stop TGA, and are of the following
compositions:

sense primer: ACT GGT ACC GCG AGC TGC CCT CGG AG
              (Kpn I restriction site)

antisense primer: GAC TCT AGA GGT TCT GCA GGT GAC TCA G
              (Xba I restriction site)
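
As an illustration of how the cloning sites are carried by the primers, the short Python sketch below (an assumption-labelled illustration, not part of the original method) locates the Kpn I (GGTACC) and Xba I (TCTAGA) recognition sequences within the two primers given above. The helper names are hypothetical.

```python
# Recognition sequences of the two enzymes named in the text.
RECOGNITION_SITES = {"KpnI": "GGTACC", "XbaI": "TCTAGA"}

SENSE = "ACT GGT ACC GCG AGC TGC CCT CGG AG"
ANTISENSE = "GAC TCT AGA GGT TCT GCA GGT GAC TCA G"


def clean(seq):
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return "".join(seq.split()).upper()


def site_offset(primer, enzyme):
    """0-based offset of the recognition site in the primer, or -1 if absent."""
    return clean(primer).find(RECOGNITION_SITES[enzyme])


for name, primer, enzyme in (("sense", SENSE, "KpnI"),
                             ("antisense", ANTISENSE, "XbaI")):
    print(name, enzyme, site_offset(primer, enzyme))
```

In both primers the site follows a three-nucleotide 5' extension, which leaves a short overhang for the enzyme to bind near the end of the PCR product.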

The reaction is carried out for 30 cycles of
94°C/1 minute, 54-60°C/1 minute 30 seconds and 72°C/
1 minute 30 seconds, followed by a final cycle of
72°C/6 minutes.
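
For planning purposes, the hold time of this programme can be totalled as in the sketch below. This is a hedged illustration only: it sums the programmed holds and ignores ramping between temperatures, which depends on the thermocycler and is not specified in the text.

```python
def program_seconds(cycles, step_seconds, final_extension_seconds):
    """Total programmed hold time: repeated cycle steps plus the final extension."""
    return cycles * sum(step_seconds) + final_extension_seconds


# 30 cycles of 1 min / 1 min 30 s / 1 min 30 s, then 6 min at 72 °C.
total = program_seconds(30, (60, 90, 90), 6 * 60)
print(total / 60, "minutes of programmed holds")
```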

4) Obtaining of the sequence of the human cDNA

In a first step, the PCR product is separated from
the oligonucleotides on a column of Sephacryl S-400, and
then desalted by exclusion chromatography on a column of
polyacrylamide P10 (Biorad reference 1504144). The
sequencing reactions are carried out using the Applied
Biosystems kit (reference 401628) with oligonucleotides
specific for the cDNA. The sequence obtained is very
similar to that of monkey SR-p70a, and the deduced
protein contains 636 amino acids (Figure 6).

In a similar manner, other sequences originating
from human lines or tissues were obtained for the coding
portion of human SR-p70, in particular from the lung or
pancreas. The proteins deduced from these sequences are
identical to those obtained for the HT-29 line.

5) Cloning of the human cDNA into plasmid pCDNA3
(Invitrogen V 790-20)

The PCR product obtained in 3) and also the
plasmid are digested with the two restriction enzymes Kpn
I and Xba I and then purified after migration on a 1%
agarose gel using the Geneclean kit (Bio 101 reference
3105). After ligation with 100 ng of insert and 10 ng of
vector and transformation (technique described in EXAMPLE
I.3.g and i), the recombinant clones are verified by
sequencing using the Applied Biosystems kit mentioned
above.

EXAMPLE III

Cloning of mouse SR-p70 cDNA from AtT-20 (pituitary
tumour) cells

1) Cell culturing of the line AtT-20

The cells are cultured in Ham F10 medium (GIBCO
31550-023) with the addition of 15% of horse serum (GIBCO
26050-047), 2.5% of foetal calf serum (GIBCO 10081-073)
and 50 mg/l of gentamicin, to semi-confluence.

2) Preparation of the complementary DNA library

The library is produced as described in EXAMPLE
I.2 and 3 from the cells cultured above.

3) Screening of the library 

a) Preparation of the membranes 

The clones of the library are plated out on LB
agar medium (Petri dishes 150 mm in diameter) coated with
Biodyne A membranes (PALL reference BNNG 132). After one
night at 37 °C, the clones are transferred by contact onto
fresh membranes. The latter are treated by depositing
them on 3 mm Whatman paper soaked with the following
solutions: 0.5 N NaOH, 1.5 M NaCl for 5 minutes, then
0.5 M Tris-HCl pH 8, 1.5 M NaCl for 5 minutes. After
treatment with proteinase K in the following buffer:
10 mM Tris-HCl pH 8, 10 mM EDTA, 50 mM NaCl, 0.1% SDS,
100 µg/ml proteinase K, for one hour at room temperature,
the membranes are washed copiously in 2 x SSC (sodium
citrate, NaCl), dried and then incubated in an oven under
vacuum at 80 °C for 20 minutes.

b) Preparation of the probe 

On the basis of monkey and human SR-p70 cDNA
sequences, a first sequence was produced on a fragment
amplified from line AtT-20 mRNA as described in EXAMPLE
II.3 and 4, with the oligomers of the following
compositions:

sense primer: GCC ATG CCT GTC TAC AAG

antisense primer: ACC AGC TGG TTG ACG GAG.

On the basis of this sequence, an oligomeric
probe specific for mouse was chosen and possesses the
following composition:
GAG CAT GTG ACC GAC ATT G.

100 ng of the probe are labelled at the 3' end
with 10 units of terminal transferase (Pharmacia) and
100 µCi of [α-32P]dCTP 3000 Ci/mmol (Amersham reference
PB 10205) in 10 µl of the following buffer: 30 mM Tris-
HCl pH 7.6, 140 mM cacodylic acid, 1 mM CoCl2, 0.1 mM DTT
for 15 minutes at 37 °C. The radiolabelled nucleotides not
incorporated are removed on a column of polyacrylamide
P10 (Biorad, reference 1504144). The probe obtained has
a specific activity of approximately 5 x 10* dpm/µg.

c) Prehybridization and hybridization 

The membranes prepared in a) are prehybridized
for 30 minutes at 42 °C in 6 x SSC, 5 x Denhardt's, 0.1%
SDS, and then hybridized for a few hours in the same
buffer with the addition of the probe prepared in b) in
the proportion of 10 s dpm/ml . 

d) Washing and exposure of the membranes

The membranes are washed twice at room
temperature in 2 x SSC/0.1% SDS buffer and then for one
hour at 56 °C in 6 x SSC/0.1% SDS. The hybridized clones
are visualized with KODAK XOMAT films. A positive clone
containing the mouse SR-p70 is selected and hereinafter
designated as SR-p70c.
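
By way of illustration only, the stringency of the final wash can be related to the probe by a rule-of-thumb estimate of its melting temperature. The sketch below uses the Wallace rule, Tm = 2 x (A+T) + 4 x (G+C) in °C, which is an assumption introduced here and is not stated in the original text; it is a rough approximation for short oligonucleotides in high-salt buffer.

```python
PROBE = "GAG CAT GTG ACC GAC ATT G"  # mouse-specific oligomeric probe from the text


def wallace_tm(oligo):
    """Rule-of-thumb melting temperature (°C) of a short oligonucleotide."""
    seq = "".join(oligo.split()).upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return 2 * at + 4 * gc


print(wallace_tm(PROBE))
```

Under this approximation the estimate for the 19-mer lies a couple of degrees above the 56 °C wash described in d).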

4) Sequencing of mouse SR-p70 and analysis of the 
sequence 

The sequence is obtained using the Applied
Biosystems kit (reference 401628). The protein sequence
deduced from mouse SR-p70c cDNA (Figure 7) exhibits a
very strong homology with the human and monkey sequences,
except in the N-terminal portion which diverges strongly
(see Figure 9). Using the so-called PCR technique in a
similar manner to that described in EXAMPLE II.3 and 4,
a second 5' sequence (originating from the same AtT-20
library) was obtained (Figure 8). The deduced N-terminal
protein sequence (sequence designated SR-p70a) is very
similar to that deduced from human and monkey SR-p70
cDNAs (SR-p70a) (Figure 9). The line AtT-20 hence affords
at least two SR-p70 transcripts. The latter two diverge in
the N-terminal portion through different splicings.

EXAMPLE IV 

1) Production of recombinant SR-p70 protein in E. coli
a) Construction of the expression plasmid

This consists in placing the COOH-terminal
portion of the monkey SR-p70a protein, from the valine at
position 427 to the COOH-terminal histidine at position
637, in fusion with the glutathione S-transferase (GST)
of the plasmid vector pGEX-4T-3 (Pharmacia reference 27-
4583). For this purpose, the corresponding insert of SR-
p70a (position 1434 to 2066) was amplified by PCR with
10 ng of plasmid containing monkey SR-p70a cDNA. The
nucleic acid primers are of the following composition:

sense primer: TTT GGA TCC GTC AAC CAG CTG GTG GGC CAG
              (BamHI restriction site)

antisense primer: AAA GTC GAC GTG GAT CTC GGC CTC C.
              (Sal I site)

The fragment obtained and also the vector are
digested with the restriction enzymes BamHI and Sal I and
cloning is carried out as described in EXAMPLE II.5. The
selected clone is referred to as pG SR-p70.
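
The nucleotide and amino acid coordinates given for this insert can be checked against one another. The following Python sketch is illustrative only (the helper names are assumptions); it confirms that nucleotides 1434 to 2066 form a whole number of codons and that this number equals the count of residues from valine 427 to histidine 637.

```python
def inclusive_length(first, last):
    """Number of positions from first to last, both included."""
    return last - first + 1


def codon_count(first_nt, last_nt):
    """Number of complete codons in an insert; raises if it is not in frame."""
    n = inclusive_length(first_nt, last_nt)
    if n % 3:
        raise ValueError("insert length is not a multiple of 3")
    return n // 3


insert_codons = codon_count(1434, 2066)      # nucleotide positions from the text
fused_residues = inclusive_length(427, 637)  # Val427 to His637
print(insert_codons, fused_residues)
```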
b) Expression and purification of the GST-pSR-p70 fusion
protein

This step was carried out using the "bulk GST
purification module" kit (Pharmacia Reference 27-4570-
01).

In outline, the recombinant clone was cultured at
37 °C in one litre of 2 x YTA medium + 100 µg/ml
ampicillin. At OD 0.8, expression is induced with 0.5 mM
IPTG for 2 hours at 37 °C. After centrifugation, the cell
pellet is taken up in cold PBS and then sonicated.
After the addition of 1% Triton X-100, the
preparation is incubated for 30 minutes with agitation at
room temperature. After centrifugation at 12,000 g for
10 minutes at 4°C, the supernatant is recovered.
Purification is then carried out on a glutathione-
Sepharose 4B affinity chromatography column. Binding and
washing are carried out in PBS buffer and elution is
carried out by competition with reduced glutathione. The
final concentration is brought to 300 µg/ml of fusion
protein.

2) Production of SR-p70a protein in COS-3 cells

COS-3 cells are transfected with pSE1 plasmid DNA
into which monkey SR-p70a cDNA has been cloned (EXAMPLE
I.1), or with the vector pSE1 plasmid DNA as control, by
the DEAE-dextran technique: the COS-3 cells are
inoculated at 5 x 10 s cells per 6 cm dish in culture
medium containing 5% of foetal bovine serum (EXAMPLE
I.1). After culture, the cells are rinsed with PBS. 1 ml
of the following mixture is added: medium containing
6.5 µg of DNA, 250 ng/ml of DEAE-dextran and 100
chloroquine. The cells are incubated at 37 °C in 5% CO2
for 4 to 5 hours. The medium is aspirated off, 2 ml of
PBS containing 10% of DMSO are added and the cells are
incubated for one minute, shaking the dishes gently. The
medium is aspirated off again and the cells are rinsed
twice with PBS. The cells are then incubated at 37 °C with
medium containing 2% of foetal bovine serum for the
period during which expression takes place, which is
generally 3 days.

The SR-p70a protein is then analysed by
immunoblotting as described in EXAMPLE VI.

EXAMPLE V 

Preparation of specific antibodies 

150 µg of proteins of the sample prepared
according to EXAMPLE IV were used to immunize a rabbit
(New Zealand male weighing 1.5 to 2 kg approximately).
The immunizations were performed every 15 days according
to the protocol described by Vaitukaitis, Methods in
Enzymology, 1981, 73, 46. At the first injection, one
volume of antigenic solution is emulsified with one
volume of Freund's complete adjuvant (Sigma reference
4258). Five boosters were administered in Freund's
incomplete adjuvant (Sigma reference 5506).

EXAMPLE VI

Detection of the SR-p70 protein: Western immunoblotting

1) Materials used for immunoblotting
a) Cell lines used for immunoblotting

The following cell lines were cultured as
described in the catalogue "Catalogue of cell lines and
hybridomas, 7th edition, 1992" of the ATCC (American Type
Culture Collection): COS-3, CV-1 (monkey kidney cell
line), HT-29, U-373MG (human glioblastoma), MCF7 (human
mammary adenocarcinoma), SKNAS (human neuroblastoma
cultured under the same conditions as COS-3), SK-N-MC
(human neuroblastoma), IMR-32 (human neuroblastoma),
CHP212 (human neuroblastoma cultured under the same
conditions as CV-1), Saos-2 (osteosarcoma), SK-OV-3
(ovarian adenocarcinoma) and SW 480 (human colon
adenocarcinoma).

b) COS-3 cells transfected by SR-p70a cDNA

COS-3 cells were transfected as described in
EXAMPLE IV.2. As a control, the cells were transfected
with pSE1 plasmid DNA not containing recombinant SR-p70a
cDNA.

2) Preparation of protein samples from a eukaryotic cell 
culture or from transfected cells 

After culture, the cells are washed with PBS and
then taken up in RIPA buffer (PBS with 1% NP40, 0.5%
sodium deoxycholate, 0.5% SDS) supplemented with 10 ng/ml
RNAse A, 20 µg/ml DNAse 1, 2 ng/ml aprotinin, 0.5 µg/ml
leupeptin, 0.7 µg/ml pepstatin and 170 ng/ml PMSF. The
cells are sonicated at 4°C and left for
30 minutes at 4°C. After microcentrifugation at
12,000 rpm, the supernatant is recovered. The protein
concentration is measured by the Bradford method.

3) Western blotting 

5 or 50 µg of proteins (50 µg for the cell lines
and 5 µg for transfected cells) are placed in 0.2 volume
of the following 6 x electrophoresis buffer: 0.35 mM
Tris-HCl pH 6.8, 10.3% SDS, 36% glycerol, 0.6 mM DTT,
0.012% bromophenol blue. The samples are applied and run
in a 10% SDS-PAGE gel (30:0.8 Bis) and then
electrotransferred onto a nitrocellulose membrane.
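
The proportion of 0.2 volume of 6 x buffer per volume of sample brings the loading buffer to its working strength. A minimal sketch of this dilution arithmetic follows; the function name is an illustrative assumption, and exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction


def final_strength(stock_x, buffer_volumes, sample_volumes=1):
    """Strength of a concentrated buffer after mixing with the sample."""
    buffer_volumes = Fraction(buffer_volumes)
    total = buffer_volumes + Fraction(sample_volumes)
    return Fraction(stock_x) * buffer_volumes / total


# 0.2 volume of 6 x buffer added to 1 volume of protein sample.
print(final_strength(6, Fraction(1, 5)))
```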

4) Visualization with the antibody

The membrane is incubated for 30 minutes in TBST
blocking buffer (10 mM Tris-HCl pH 8, 150 mM NaCl, 0.2%
Tween 20) with the addition of 5% of milk (GIBCO - SKIM
MILK) at room temperature. The membrane is brought into
contact successively with the anti-SR-p70 (αSR-p70)
antibody in the same buffer for 16 hours at 4°C, washed
3 times for 10 minutes with TBST and then incubated for
one hour at 37 °C with a second, anti-rabbit
immunoglobulin antibody coupled to peroxidase (SIGMA
A055). After three washes of 15 minutes, the
visualization is performed by chemiluminescence using
the ECL kit (Amersham RPN2106).

In parallel, the same samples were subjected to
visualization with an anti-p53 (αp53) antibody (Sigma
BP5312) followed by a second, anti-mouse immunoglobulin
antibody.

5) Figures and results 

Figure 10: Immunoblot of the SR-p70 protein

Figure 10a: Detection of the recombinant SR-p70 protein

- columns 1 and 3: COS-3 transfected by the vector pSE1.
- columns 2 and 4: COS-3 transfected by plasmid pSE1
containing SR-p70a cDNA.
- columns 1 and 2: visualization with the anti-SR-p70
(αSR-p70) antibody.
- columns 3 and 4: visualization with the anti-p53 (αp53)
antibody.

Figure 10b: Detection of the endogenous SR-p70 protein

- columns 1: COS-3; 2: CV-1; 3: HT-29; 4: U-373 MG; 5:
MCF7; 6: SKNAS; 7: SK-N-MC; 8: IMR-32; 9: CHP212; 10:
Saos-2; 11: SK-OV-3 and 12: SW480.

A: Visualization with the αSR-p70 antibody
B: Visualization with the αp53 antibody.

The αSR-p70 antibody specifically recognizes the
recombinant proteins (Figure 10a) and endogenous proteins
(Figure 10b) and does not cross-react with p53. The
analysis of human or monkey cell lines shows that the
SR-p70 protein, like p53, is generally weakly detectable.
In contrast, when an accumulation of p53 exists, SR-p70
becomes, for its part also, more readily detectable
(Figure 10b). A study by RT-PCR of the distribution of
SR-p70 transcripts shows that the gene is expressed in
all the cell types tested.

EXAMPLE VII 

Cloning of the SR-p70 gene and chromosomal localization

1) Cloning of SR-p70 gene

The library used is a cosmid library prepared
in the EXAMPLE III.3, with an SR-p70 DNA fragment
labelled with 32P with the BRL "Random Primers DNA
Labelling Systems" kit (reference 18187-013). The
hybridization and washing buffers are treated by adding
50% of formamide. The last wash is carried out in 0.1
x SSC/0.1% SDS at 60 °C. In a similar manner, the SR-p70
gene was isolated from a library prepared with C57 black
mouse genomic DNA.

An analysis and a partial sequencing of the
clones demonstrate the presence of 14 exons with a
structure close to that of the p53 gene, in particular in
the central portion where the size and positioning of the
exons are highly conserved (Figure 12). This structure
was partially defined in mouse and in man.

As an example, the human genomic sequences of the
3' region of intron 1, of exon 2, of intron 3 and of the
5' region of exon 3 are presented in Figure 13.

2) Chromosomal localization of the SR-p70 gene in man 

This was carried out with human SR-p70 gene DNA
using the technique described by R. Slim et al., Hum.
Genet., 1991, 88, 21-26. Fifty mitoses were analysed,
more than 80% of which had double spots localized at 1p36
on both chromosomes and more especially at 1p36.2-1p36.3
(Figure 11). The identification of chromosome 1 and its
orientation are based on the heterochromatin of the
secondary constriction. The pictures were produced on a
Zeiss Axiophot microscope, taken with a LHESA cooled CCD
camera and treated with Optilab.

EXAMPLE VIII 

A) Demonstration of an mRNA coding for a deduced human
SR-p70 protein possessing both a shorter N-terminal end
and a divergence.

1) Culturing of IMR-32 (human neuroblastoma) cells

The cells were cultured as described in the
catalogue "Catalogue of cell lines and hybridomas, 7th
edition, 1992" of the ATCC (American Type Culture
Collection).

2) Preparation of the cDNA

The RNA is prepared as described in Example
I.2.a. The cDNA is prepared in a manner similar to that
described in Example I.3, with 5 µg total RNA in a final
volume of 20 µl using a poly(T)12 primer and with cold
nucleotides. The reaction is not interrupted with EDTA.

3) Specific amplification of SR-p70 cDNA by the so-called
PCR technique

The polymerization is carried out with 2 µl of
cDNA in 50 µl final with the buffer of the following
composition: 50 mM Tris-HCl pH 9.2, 16 mM (NH4)2SO4,
1.75 mM MgCl2, in the presence of 10% DMSO, 0.4 mM NTP,
100 ng of each of the two nucleic acid primers and 3.5
units of the mixture of TAQ and PWO polymerases
(Boehringer Mannheim, ref. 1681 842).

The primer pair is of the following composition: 

sense primer: AOOCCSaCQTSaSGAAS (position IS to 32, Figure 6)
antisense primer: CTTGGCGATCTGGCAGTAG (position 503 to 485, Figure 6).

The reaction is carried out for 30 cycles at
95°C/30 seconds, 58°C/1 minute and 68°C/2 minutes
30 seconds, followed by a final cycle of 68°C/10 minutes.

The PCR product is subjected to electrophoresis
on a 1% agarose gel (TAE buffer). After ethidium bromide
staining, two major bands are revealed: a band
approximately 490 bp in size (expected size (see Figure
6)) and an additional band approximately 700 bp in size.
The latter is extracted from the gel using the
"Geneclean" kit (Bio 101, ref 1001 400). After a
desalting on a column of polyacrylamide P10 (Biorad, ref
15011050), the fragment is subjected to a further PCR
amplification for 10 cycles as described above.



4) Determination of the sequence of the amplified product

In a first step, the PCR product is separated from
the oligonucleotides on a column of Sephacryl S-400
(Pharmacia 17-0609-01) and then desalted on a column of
P10. The sequencing reaction is carried out using the
Applied Biosystems kit (ref. 401 628) (373 DNA sequencer)
with the antisense primer.

The sequence obtained is identical to the SR-p70
cDNA sequence (Example II.4) with an insertion of 198 bp
between positions 217 and 218 (Figure 14). The deduced N-
terminal protein sequence (sequence designated SR-p70d)
is 49 amino acids shorter, with a divergence of the first
13 amino acids (sequence ID No. 13). There is hence
coexistence of at least two different SR-p70 transcripts
as already described for the mouse AtT-20 line.
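
The size of the additional band seen on the gel agrees with the insertion found by sequencing. The sketch below is an illustrative check only; the 5% tolerance used to compare a calculated size with a size read off an agarose gel is an assumption introduced here, not a value from the text.

```python
EXPECTED_BAND_BP = 490    # approximate size of the expected product
INSERTION_BP = 198        # insertion found between positions 217 and 218
OBSERVED_EXTRA_BAND_BP = 700  # approximate size of the additional band


def band_with_insertion(reference_bp, insertion_bp):
    """Predicted amplicon size when the template carries an insertion."""
    return reference_bp + insertion_bp


def roughly_equal(predicted, observed, tolerance=0.05):
    """True if two sizes agree within the (assumed) relative gel-reading tolerance."""
    return abs(predicted - observed) <= tolerance * observed


predicted = band_with_insertion(EXPECTED_BAND_BP, INSERTION_BP)
print(predicted, roughly_equal(predicted, OBSERVED_EXTRA_BAND_BP))
```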

B) Cloning of human SR-p70 and demonstration of an mRNA
coding for a deduced human SR-p70 protein possessing the
same N-terminal end as SR-p70d and a divergence in the C-
terminal portion

1) Specific amplification of SR-p70 cDNA by the so-called
PCR technique

The amplification was carried out as described in
EXAMPLE VIII.A from purified RNA of IMR-32 cells with the
primer pair of the following composition:

sense primer: GCG GCC ACG ACC GTG AC (position 160
to 176, sequence ID No. 11)

antisense primer: GGC AGC TTG GGT CTC TGG (position
1993 to 1976, Figure 6).

After removal of the excess primers on an S400
column and desalting on a P10 column, 1 µl of the sample
is subjected again to a PCR with the primer pair of the
following composition:

sense primer: TAT CTC GAG CTG TAC GTC GGT GAC CCC
              (XhoI) (position 263 to 280, sequence ID No. 11)

antisense primer: ATA TCT AGA TCA GTG GAT CTC GGC CTC
              (XbaI) (position 1943 to 1926, Figure 6).

2) Cloning of the amplified product into plasmid pCDNA3

The PCR product obtained in 1) is desalted on a
P10 column, digested with the restriction enzymes XhoI
and XbaI and then cloned into plasmid pCDNA3 as described
in EXAMPLE II.5. Two recombinant clones are sequenced
using the Applied Biosystems kit with the
oligonucleotides specific for SR-p70 cDNA.

The first sequence obtained corresponds to the
complete sequence of the mRNA coding for SR-p70 described
in EXAMPLE VIII.A. The deduced protein contains 587 amino
acids (sequence ID No. 13 and Figure 16).

The second sequence obtained is identical to the
SR-p70d cDNA sequence described above, but with two
deletions, of 149 bp and of 94 bp between positions 1049
and 1050 on the one hand, and between positions 1188 and
1189 on the other hand (sequence ID No. 14 and Figure
15). The protein sequence deduced from this second
sequence reveals a protein having an N-terminal portion
49 amino acids shorter, with a divergence in the first 13
amino acids as well as a divergence of protein sequence
between amino acids 350 and 397 (sequence ID No. 15 and
Figure 16) (sequence designated SR-p70e). The deduced
protein contains 506 amino acids.

C) Demonstration of an mRNA coding for a deduced human
SR-p70 protein possessing a shorter N-terminal end

1) Culturing of SK-N-SH (human neuroblastoma) cells

The cells are cultivated as described in the
"Catalogue of cell lines and hybridomas, 7th edition,
1992" of the ATCC (American Type Culture Collection).

2) Preparation of the cDNA and amplification of SR-p70
cDNA by the so-called PCR technique

These steps are carried out as described in
EXAMPLE VIII.A with the primer pair of the following
composition:

sense primer: AGG GGA CGC AGC GAA ACC (position 128
to 145, Figure 17)

antisense primer: GGC AGC TTG GGT CTC TGG (position
1993 to 1976, Figure 6).

The sequencing is carried out with the Applied
Biosystems kit with primers specific for SR-p70 cDNA, and
reveals two cDNAs:

- a first cDNA corresponding to the mRNA coding for SR-
p70a

- a second cDNA having a deletion of 98 bp between
positions 24 and 25 (sequence ID No. 16 and Figure 15).

This deletion comprises the translation
initiation ATG of SR-p70a. The protein deduced
(designated SR-p70f) from this second cDNA possesses a
translation initiation ATG downstream corresponding to an
internal ATG of SR-p70a. The deduced protein hence
contains 588 amino acids (sequence ID No. 17 and Figure
16) and is truncated with respect to the 48 N-terminal
amino acids of SR-p70a.

D) Demonstration of an mRNA coding for human SR-p70b

1) Culturing of K562 cells

The cells are cultured as described in the
"Catalogue of cell lines and hybridomas, 7th edition,
1992" of ATCC (American Type Culture Collection).

2) Preparation of the cDNA, amplification of SR-p70 cDNA
by the so-called PCR technique and sequencing

These steps are carried out as described in
EXAMPLE VIII.C.

The sequencing reveals two cDNAs:
a first cDNA corresponding to the mRNA coding for SR-
p70a, and a second cDNA having a deletion of 94 bp
between positions 1516 and 1517 (sequence ID No. 18 and
Figure 15). The deduced protein (designated SR-p70b)
contains 499 amino acids and possesses a C-terminal
sequence truncated by 137 amino acids relative to SR-
p70a, with the last 4 amino acids divergent (sequence ID
No. 19 and Figure 21).

This cDNA is similar to the one described in
EXAMPLE I relating to monkey SR-p70b.
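
The lengths of the truncated human variants follow from the 636 amino acids of human SR-p70a (EXAMPLE II.4) and the truncations reported in this example. The Python sketch below is an illustrative consistency check with hypothetical names. SR-p70e is left out because it also carries internal divergence. For SR-p70b, a C-terminal truncation of 137 residues implies 636 - 137 = 499 residues, the same length as the monkey SR-p70b of EXAMPLE I.

```python
REFERENCE_LENGTH = 636  # human SR-p70a, EXAMPLE II.4

# variant: number of residues by which it is reported to be shorter than SR-p70a
TRUNCATIONS = {
    "SR-p70d": 49,   # shorter N-terminal end
    "SR-p70f": 48,   # initiation on an internal ATG
    "SR-p70b": 137,  # truncated C-terminal sequence
}


def variant_length(reference, truncated_by):
    """Length of a variant that is shorter than the reference by a given count."""
    return reference - truncated_by


lengths = {name: variant_length(REFERENCE_LENGTH, cut)
           for name, cut in TRUNCATIONS.items()}
print(lengths)
```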

The molecules described in this example (EXAMPLE
VIII.A, B, C and D) reveal SR-p70 variants which are the
outcome of differential splicings of the primary mRNA,
transcribed by the SR-p70 gene.

The SR-p70a is encoded by an mRNA composed of 14
exons (see EXAMPLE VII). This is the reference protein.
SR-p70b is the outcome of an insertion between exons 3
and 4 and of the absence of exons 11 and 13. SR-p70f is
the outcome of the absence of exon 2. This example
describes the existence of SR-p70 variants non-
exhaustively, with a strong probability of existence of
other variants. Similarly, the existence of these
variants described in this example, as well as SR-p70a,
is not limited to the lines in which they have been
demonstrated. In effect, studies performed by RT-PCR
showed that these variants are to be found in the various
lines studied.

Furthermore, the initiation methionine of SR-p70f
corresponds to an internal methionine of SR-p70a,
suggesting the possibility of initiation downstream on
the mRNA coding for SR-p70a.

EXAMPLE IX

Obtaining a 5' sequence of human SR-p70a mRNA

1) Amplification of the 5' end of SR-p70 cDNA by PCR

The cell culturing and the preparations of total
RNA and of cDNA are carried out as described in EXAMPLE
VIII.1 and 2. The RNA template is hydrolysed by
incubation for 5 minutes at 65°C after the addition of
4 µl of 500 mM EDTA and 4 µl of 2 N NaOH. The sample is
then desalted on a P10 column. The cDNA is elongated at
the 3' end with a dG tail as described in EXAMPLE I.3.d,
in a final volume of 40 µl. After the addition of 4 µl of
500 mM EDTA and 4 µl of 2 N NaOH, the cDNA is incubated
at 65°C for 3 minutes and then desalted on a P10 column.
PCR amplification is carried out as described in EXAMPLE
VIII.3 with 8 µl of cDNA and for 30 cycles with the
primer pair of the following composition:

sense primer: CCCCCCCCCCCCCCN (where
N equals G, A or T)

antisense primer: CCATCAGCTCCAGGCTCTC (position 1167
to 1149, Figure 6).

After removal of the excess primers on an S-400
column and desalting on a P10 column, 1 µl of the sample
is subjected again to a PCR with the pair of the
following composition:

sense primer: CCCCCCCCCCCCCCN
antisense primer: CCAGGACAGGCGCAGATG (position 928
to 911, Figure 6).

The sample, passed again through an S-400 column
and a P10 column, is subjected to a third amplification
for 20 cycles with the following pair:

sense primer: CCCCCCCCCCCCCCCN
antisense primer: CTTGGCGATCTGGCAGTAG (position 503
to 485, Figure 6).
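
The three antisense primers form a nested series, each one lying upstream of the one used in the preceding round. The sketch below is an illustration under stated assumptions (the tuple layout and helper names are hypothetical); it checks that the length of each primer matches its quoted coordinates and that the series is nested towards the 5' end.

```python
# (sequence, 5' position, 3' position) on the Figure 6 numbering, antisense strand.
NESTED_ANTISENSE = [
    ("CCATCAGCTCCAGGCTCTC", 1167, 1149),  # first round
    ("CCAGGACAGGCGCAGATG", 928, 911),     # second round
    ("CTTGGCGATCTGGCAGTAG", 503, 485),    # third round
]


def span(start, end):
    """Number of positions covered by a primer, both ends included."""
    return abs(start - end) + 1


def lengths_match(primers):
    """True if every primer is as long as its quoted coordinates imply."""
    return all(len(seq) == span(a, b) for seq, a, b in primers)


def is_nested(primers):
    """True if each later primer lies entirely upstream of the previous one."""
    return all(nxt[1] < prev[2] for prev, nxt in zip(primers, primers[1:]))


print(lengths_match(NESTED_ANTISENSE), is_nested(NESTED_ANTISENSE))
```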

2) Determination of the SR-p70 cDNA 5' sequence 

The sequence is produced as described in EXAMPLE
VIII.4. This sequence reveals a non-coding 5' region of
at least 237 bases upstream of the initiation ATG of SR-
p70a (Figure 17). By comparison of this sequence
(obtained from the line IMR-32) with the one obtained
from the line HT-29 in particular (Figure 6), two point
differences (Figure 17: see bold characters) are revealed
(G → A and C → T), positioned, respectively, at -30 and
-20 from the initiation ATG of SR-p70a (Figures 6 and
17). This variability is located in exon 2 (Figure 13).
It is not ruled out that this variability is also to be
found within a coding frame as the outcome of an
alternative splicing as described in EXAMPLES III in
mouse and VIII in man, or alternatively as the outcome of
a translation initiation on a CTG (as has been
demonstrated for FGFb (Proc. Natl. Acad. Sci. USA, 1989,
86, 1836-1840)).

Similarly, it is not ruled out that this 
variability has a repercussion on the translation of 
SR-p70 or on the splicing of the primary RNA. 

At all events, this variability, probably of
allelic origin, may serve as a marker, either at genomic
level (see EXAMPLE XI) or at mRNA level (see EXAMPLE X).



EXAMPLE X 

1) Analysis by PCR of the transcriptional expression of
SR-p70a in cell samples (RT-PCR)

Cell culturing (SK-N-AS, SK-N-MC, HT-29, U-373MG,
SW480, IMR-32, CHP212) is carried out as described in
Example VI.1.a (referring to the catalogue "Catalogue of
cell lines and hybridomas, 7th edition 1992" of the
ATCC).

The preparation of the cDNA and the PCR
amplification are carried out as described in EXAMPLE
VIII.2 and 3. The primer pair used is of the following
composition:

sense primer: AGGGGACGCAGCGAAACC (position 128 to
145, Figure 17)

antisense primer: GGCAGCTTGGGTCTCTGG (position 1993
to 1976, Figure 6).

The samples are analysed by electrophoresis on a
1% agarose gel and visualization with ethidium bromide
(Figure 18).

The size of the band obtained in the samples
corresponds to the expected size (approximately 2 kb,
Figures 6 and 17). The intensity of the bands obtained is
reproducible. A reamplification of 1 µl of the sample
under the same conditions for 20 cycles reveals a band in
each of the samples.

2) Determination of the sequence of the amplified
products

After passage of the samples through S-400 and
P10 columns, sequencing is carried out on an Applied
Biosystems sequencer 373 with the reference kit 401 628.
The primers used are, inter alia, the following:



                        position        Figure

AGGGGACGCAGCGAAACC      128 to 145      22
CTTGGCGATCTGGCAGTAG     503 to 485      6
GATGAGGTGGCTGGCTGGA     677 to 659      6
CCATCAGCTCCAGGCTCTC     1167 to 1149    6
TGGTCAGGTTCTGCAGGTG     1605 to 1587    6
GGCAGCTTGGGTCTCTGG      1993 to 1976    6



No protein difference in the SR-p70a was
detected. However, sequences obtained reveal a double
variability at positions -20 and -30 upstream of the
initiation ATG of SR-p70a (Figures 6 and 17). This
variability, probably of allelic origin, enables two
classes of transcripts to be defined: a first class
possessing a G at position -30 and a C at position -20
(class G-30/C-20) and a second class possessing a
difference at two positions: an A at -30 and a T at -20
(class A-30/T-20).

First class: SK-N-AS, SK-N-MC, HT-29, U-373MG, SW480.
Second class: IMR-32, CHP212.

EXAMPLE XI

Analytical method of determination of the allelic
distribution of the SR-p70 gene in a population of 10
persons

This allelic distribution is based on the allelic
variability demonstrated in EXAMPLES IX and X:

• G-30/C-20 allele possessing, respectively, a G and a
C at positions -30 and -20 upstream of the
initiation ATG of SR-p70a.

• A-30/T-20 allele possessing, respectively, an A and
a T at the same positions.

This variability may be demonstrated by the use of
restriction enzymes that differentiate the two
alleles (Figure 13). As an example:

• Enzyme Bpl I having a cleavage site only on the
G-30/C-20 allele in the zone of interest (this site
encompasses both variable positions).

• Enzyme Sty I having a cleavage site only on the
A-30/T-20 allele in the zone of interest.

1) Genomic amplification of exon 2 by PCR

The polymerization reaction is carried out with
500 ng of purified genomic DNA, in 50 µl final with the
conditions described in Example VIII.3.

The primer pair is of the following composition:

Sense primer: CACCTACTCCAGGGATGC (position 1 to 18, Figure 13)

Antisense primer: AGGAAAATAGAAGCGTCAGTC (position 833 to 813, Figure 13).

The reaction is carried out for 30 cycles as described in
EXAMPLE VIII.3.

After removal of the excess primer on an S-400
column and desalting on a P10 column, 1 µl of the sample
is amplified again for 25 cycles under the same
conditions with the following primer pair:

Sense primer: CAGGCCCACTTGCCTGCC (position 25 to 32, Figure 13)

Antisense primer: CTGTCCCCAAGCTGATGAG (position 506 to 488, Figure 13).

The amplified products are subjected to
electrophoresis on a 1% agarose gel (Figure 19-A).

2) Digestion with the restriction enzyme Sty I

The samples are desalted beforehand on a P10
column and then digested with the restriction enzyme Sty I
(BRL 15442-015) in a buffer of the following
composition: 50 mM Tris-HCl pH 8, 100 mM NaCl, 10 mM
MgCl2, at 37°C for 30 min. The digestion products are
analysed by electrophoresis on a 1% agarose gel (TAE
buffer). Visualization is carried out by ethidium bromide
staining (Figure 19-B).

A band of 482 base pairs characterizes the
G-30/C-20 allele (Figures 13 and 19). The presence of a
band of 376 base pairs and a band of 106 base pairs
characterizes the A-30/T-20 allele (allele possessing a
Sty I cleavage site).
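
The band sizes quoted above can be checked against the primer coordinates. The following is a minimal sketch, assuming that the primer positions are inclusive coordinates on the numbering of Figure 13; the function names and the short labels "GC" (for G-30/C-20) and "AT" (for A-30/T-20) are illustrative only.

```python
# Primer coordinates taken from the text (numbering of Figure 13, assumed inclusive).
OUTER = (1, 833)    # first round: sense primer start, antisense primer start
NESTED = (25, 506)  # second round: sense primer start, antisense primer start

# Band sizes (bp) reported in the text for the Sty I digest.
UNCUT = 482         # G-30/C-20 allele, no Sty I site
CUT = (376, 106)    # A-30/T-20 allele, cleaved once by Sty I


def amplicon_length(start, end):
    """Length in bp of a product spanning positions start..end inclusive."""
    return end - start + 1


def call_genotype(bands):
    """Map a collection of observed band sizes (bp) to a genotype label."""
    bands = set(bands)
    has_gc = UNCUT in bands
    has_at = set(CUT) <= bands
    if has_gc and has_at:
        return "GC/AT"
    if has_gc:
        return "GC/GC"
    if has_at:
        return "AT/AT"
    return "undetermined"


print(amplicon_length(*OUTER))          # first-round product
print(amplicon_length(*NESTED))         # nested product
print(sum(CUT) == UNCUT)                # the two cut fragments add up to the uncut band
print(call_genotype([482, 376, 106]))   # heterozygous pattern
```

Under these assumptions the nested product is 482 base pairs long, which matches the uncut band, and the two Sty I fragments (376 and 106 base pairs) sum to the same length.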

On the population of 10 persons, 2 persons
exhibit the G-30/C-20 and A-30/T-20 alleles, the other 8
persons being homozygous for the G-30/C-20 allele. The
study of a fresh population of 9 persons demonstrated 3
heterozygous persons exhibiting the G-30/C-20 and A-30/T-20
alleles, the other 6 persons being homozygous for the
G-30/C-20 allele.
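
As an illustration of how these genotype counts translate into an allele frequency, the sketch below counts A-30/T-20 alleles over the total number of chromosomes (two per person). The function name and the pooling of the two groups are assumptions made for the example, not part of the method described.

```python
def at_allele_frequency(n_persons, n_heterozygous, n_homozygous_at=0):
    """Frequency of the A-30/T-20 allele from genotype counts.

    Each person carries two alleles; a heterozygote contributes one
    A-30/T-20 allele and an A-30/T-20 homozygote contributes two.
    """
    total_alleles = 2 * n_persons
    at_alleles = n_heterozygous + 2 * n_homozygous_at
    return at_alleles / total_alleles


first = at_allele_frequency(10, 2)    # first population: 2 heterozygotes out of 10
second = at_allele_frequency(9, 3)    # second population: 3 heterozygotes out of 9
pooled = at_allele_frequency(19, 5)   # both groups pooled (illustrative only)

print(round(first, 3), round(second, 3), round(pooled, 3))
```

With the counts given above this gives 0.1 for the first group and about 0.167 for the second; with such small samples these values are only indicative.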
EXAMPLE XII 

Test of reversion of transformation of the line SK-N-AS
by transfection with SR-p70 cDNA

The expression vector used is described in
EXAMPLE II.5 and shown diagrammatically in Figure 15. The
method used is the so-called calcium phosphate method
described by Graham et al. (Virology 1973, 54, 2,
536-539). The line is inoculated in the proportion of
5 x 10^5 cells per dish 6 cm in diameter in 5 ml of the
medium described in EXAMPLE I.1. The cells are cultured
at 37°C and with 5% CO2 overnight. The transfection
medium is prepared in the following manner: the following
mixture is prepared by adding, in order, 1 ml of HEBS
buffer (8 mg/ml NaCl, 370 µg/ml KCl, 125 µg/ml
Na2HPO4.2H2O, 1 mg/ml dextrose, 5 mg/ml Hepes pH 7.05),
10 µg of the plasmid to be transfected and 50 µl of 2.5 M
CaCl2 added dropwise. The transfection medium is left for
30 min at room temperature and then added dropwise to the
medium contained in the culture dish. The cells are
incubated for 5 to 6 hours at 37°C/5% CO2. After the
medium is aspirated off, 5 ml of fresh medium containing
2% of foetal bovine serum are added. After 48 hours at
37°C/5% CO2, the cells are rinsed with PBS, detached by
trypsinization, diluted in 10 ml of culture medium (5%
foetal bovine serum) and plated out in a dish 10 cm in
diameter (the dilution may be adjusted in accordance with
the efficiency of transfection). After a further
incubation for 10 hours (the time for the cells to
adhere), the cells are subjected to selection by adding
G418 at a final concentration of 600 µg/ml Geneticin
equivalent for 15 to 21 days (the medium is changed every
day). The clones obtained are then rinsed with PBS, fixed
in 70% ethanol, dried, stained with 1% crystal violet and
then counted.
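
For orientation, the calcium concentrations implied by these volumes can be worked out as a simple dilution. This is a rough sketch which assumes that the volume of the plasmid solution is negligible and ignores any calcium already present in the culture medium; the function name is illustrative.

```python
def diluted_molarity(stock_molar, stock_ul, total_ul):
    """Molarity after diluting stock_ul of a stock solution to total_ul."""
    return stock_molar * stock_ul / total_ul


HEBS_UL = 1000      # 1 ml of HEBS buffer
CACL2_UL = 50       # 50 µl of 2.5 M CaCl2
MEDIUM_UL = 5000    # 5 ml of medium in the 6 cm dish

# CaCl2 in the transfection mixture (plasmid volume neglected: assumption).
in_mixture = diluted_molarity(2.5, CACL2_UL, HEBS_UL + CACL2_UL)

# CaCl2 contributed by the mixture once it is added to the dish.
in_dish = diluted_molarity(2.5, CACL2_UL, HEBS_UL + CACL2_UL + MEDIUM_UL)

print(round(in_mixture * 1000, 1), "mM in the transfection mixture")
print(round(in_dish * 1000, 1), "mM added to the culture dish")
```

Under these assumptions the mixture contains roughly 119 mM CaCl2, which falls to about 21 mM once it is added to the 5 ml of medium.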

Four plasmid transfections were carried out in
duplicate:

- plasmid pCDNA3 without insert

- plasmid pCDNA3/SR-p70 containing human SR-p70a
cDNA

- plasmid pCDNA3/SR-p70 Mut containing SR-p70a cDNA
possessing a mutation at position 293 AA (R → H)
which is analogous to the mutation 273 (R → H) in
the DNA-binding domain of p53

- control without plasmid.

The result is expressed as the number of clones 
per dish. 





                      Experiment 1   Experiment 2   Mean

pCDNA3                     172            353        262

pCDNA3/SR-p70               13              3         10

pCDNA3/SR-p70 Mut           92             87         89

Absence of plasmid           1              3          2



The number of clones obtained by transfection
with plasmid pCDNA3/SR-p70 is 25-fold less than the
number of clones obtained with the control pCDNA3 and
9-fold less than the number of clones obtained with
pCDNA3/SR-p70 Mut, indicating a mortality or an arrest of
cell division of the cells transfected with SR-p70 cDNA.
This result is not the consequence of a toxicity, in view
of the clones obtained with the mutated SR-p70 cDNA, but
probably of an apoptosis, as has been demonstrated for the
p53 protein (Koshland et al., Science, 1993, 262,
1953-1981).
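
The fold differences quoted above can be recomputed from the mean column of the table. The sketch below simply divides the mean clone counts as printed; the dictionary keys and the function name are illustrative.

```python
# Mean number of clones per dish, as printed in the "Mean" column of the table.
mean_clones = {
    "pCDNA3": 262,
    "pCDNA3/SR-p70": 10,
    "pCDNA3/SR-p70 Mut": 89,
    "no plasmid": 2,
}


def fold_reduction(reference, test):
    """How many times fewer clones the test plasmid gives than the reference."""
    return mean_clones[reference] / mean_clones[test]


vs_empty_vector = fold_reduction("pCDNA3", "pCDNA3/SR-p70")
vs_mutant = fold_reduction("pCDNA3/SR-p70 Mut", "pCDNA3/SR-p70")

print(round(vs_empty_vector, 1))  # compared with the empty vector
print(round(vs_mutant, 1))        # compared with the R293H mutant
```

The ratios come out at about 26 and about 9, in line with the rounded figures of 25-fold and 9-fold given in the text.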



EXAMPLE XIII 

Biological role of the SR-p70 protein 
The structural homology between the DNA-binding
domain of p53 and the central region of the SR-p70
protein enables it to be inferred that SR-p70 is a
transcription factor (see Figures 1 and 2). In effect,
p53 (393 amino acids) consists of several functional
domains. The N-terminal region (amino acids 1 to 91) is
involved in the activation of transcription, and contains
sites for interaction with different cellular and viral
proteins. The central portion (amino acids 92 to 292)
permits binding to the specific DNA sequences located in
the promoter regions of certain genes (the majority of
point mutations that inactivate p53 are localized in this
region), and also possesses numerous sites for
interaction with viral proteins which inhibit its
activity. Finally, the last 100 amino acids of p53 are
responsible for its oligomerization as well as for the
regulation of the latter (Hainaut P., Current Opinion in
Oncology, 1995, 7, 76-82; Prokocimer M., Blood, 1994, 84
No. 8, 2391-2411).

The sequence homology between p53 and SR-p70 is
significant, in particular as regards the amino acids
involved directly in the interaction with DNA, suggesting
that SR-p70 binds to the p53 sites on DNA. These amino
acids correspond very exactly to what are referred to as
the "hot spots", amino acids frequently mutated in human
tumours (SWISS PROT: SW: P53_human and Prokocimer M.,
Blood, 1994, 84 No. 8, 2391-2411). From this homology, it
may be deduced that the SR-p70 protein exerts a control
over the activity of the genes regulated by p53, either
independently of the latter or by forming heterooligomers
with it.

Consequently, like p53, the products of the SR-p70
gene must be involved in the control and regulation
of the cell cycle, causing the cycle to stop (momentarily
or permanently), and the implementation of programmes
such as DNA repair, differentiation or cell death. The
likelihood of the existence of "p53-like" activities had
been strongly felt with the demonstration in p53-/- mice
of activities of DNA repair and cell death in response to
ionizing radiations (Strasser et al., Cell, 1994, 79,
329-339). The authors of the present invention have
localized the human SR-p70 gene in the telomeric region
of the short arm of chromosome 1, precisely at 1p36.2-
36.3, the smallest deleted region (SRO) common to a
majority of neuroblastomas and of other types of tumours
(melanomas and sarcomas) (White et al., PNAS, 1995, 92,
5520-5524). This region of loss of heterozygosity (LOH)
defines the locus of a tumour-suppressing gene whose loss
of activity is considered to be the cause of tumour
formation. It is important to recall that this region is
also subject to "maternal imprinting"; the maternal
allele is preferentially lost in neuroblastomas having
the 1p36 deletion (without amplification of N-Myc) (Caron
et al., Hum. Mol. Gen., 1995, 4, 535-539). The wild-type
SR-p70 gene introduced into neuroblastoma cells and
expressed therein permits the reversion of their
transformation. The loss of this anti-oncogenic activity is
hence associated with the development of the tumour. The
1p36 region possesses a syntenic homology with the
distal segment of mouse chromosome 4. In this region lies
the curly tail (ct) gene (Beier et al., Mammalian Genome,
1995, 6, 269-272), which is involved in congenital
malformations of the neural tube (NTM: spina bifida,
anencephaly, etc.). The ct mouse is the best animal model
for studying these malformations. It is accepted that these
malformations result from abnormalities of cell
proliferation. Bearing in mind the nature of the SR-p70 gene
and its chromosomal localization, one of the hypotheses is
that SR-p70 could be the human homologue of ct and that, on
this basis, the detection of early mutations and chromosomal
abnormalities affecting this gene should permit, for
example, as an application, the identification of persons
at risk (0.5-1% of newborn babies affected by NTM) and
the implementation of preventive treatments (Neumann et
al., Nature Genetics, 1994, 6, 357-362; Di Vinci et al.,
Int. J. Cancer, 1994, 59, 422-426; Moll et al., PNAS,
1995, 92, 4407-4411; Chen et al., Development, 1995, 121,
681-691).



EXAMPLE XIV 



Allelic study of the SR-p70 gene 

The GC and AT alleles are readily identified by
Sty I restriction of the PCR products of exon 2 (see
EXAMPLE XI). Hence it was possible to determine in this
way, in GC/AT heterozygous individuals bearing
neuroblastoma tumours, the lost SR-p70 allele (GC or AT),
in spite of the presence of contaminating healthy tissue.

Surprisingly, when the same analysis is carried 
out on the RNA, a single allele is demonstrated 
independently of the presence or otherwise of a deletion 
and, still more surprisingly, in spite of the presence of 
healthy tissue. This suggests that the imprint 
(differential expression of the two alleles) would also 
exist in the contaminating tissue. 

In order to verify this, the same analysis was 
repeated on the RNA originating from blood cells of 
healthy GC/AT heterozygous individuals. Only one of the 
two types of transcript was detected also in these cells. 
This result confirms the observation made on the tumour 
samples regarding the existence of a generalized genetic 
imprint for the SR-p70 gene. 

The implications of this discovery are important, 
since it enables it to be postulated that a single 
sporadic mutation inactivating the active SR-p70 allele 
will give rise to a loss of activity, this potentially 
occurring in all the tissues. 

The absence of precise data on the biological
function of SR-p70 does not enable the consequences of
this loss of SR-p70 activity for the cell to be measured.
Nevertheless, its strong homology with the p53 tumour-
suppressing protein, as well as the demonstration that
SR-p70 is a transcription factor capable of utilizing the
P21 waf promoter, suggests a role of this protein in the
control of the cell cycle and in differentiation.

Knudson and Meadows, 1980 (New Eng. J. Med. 302:
1254-56), consider the IV-S neuroblastomas to be a
collection of non-malignant cells from the neural crest
carrying a mutation which interferes with their normal
differentiation.

It is conceivable that the loss of SR-p70
activity, like the loss of p53 control over the cell
cycle, favours the appearance of cellular abnormalities
such as aneuploidy, amplification (described in the case
of neuroblastomas) and other genetic reorganizations
capable of causing cell transformation (Livingstone et
al., 1992, Cell 71:923-25; Yin et al. 1992, Cell 72:937-
48; Cross et al. 1995, Science 267:1353-56; Fukasawa et
al. 1996, Science 271:1744-47). Neuroblastomas might
hence arise originally from a temporary or permanent loss
of activity of SR-p70, thereby favouring the occurrence
of oncogenic events and hence tumour progression.

In the case of the 1p36 constitutional deletion
described by Biegel et al., 1993 (Am. J. Hum. Genet.
52:176-82), IV-S neuroblastoma does indeed occur and the
gene affected is NBS-1 (SR-p70).

In conclusion, what is described for
neuroblastomas might also apply to other types of
tumours, in particular those associated with
reorganization of the end of the short arm of chromosome
1 (Report 2 international workshop on human chr 1 mapping
1995, Cytogenetics and Cell Genet. 72:113-154). From a
therapeutic standpoint, the involvement of SR-p70 in the
occurrence of tumours should lead to the avoidance of the
use of mutagenic agents in chemotherapy, bearing in mind
the risks of cell transformation by these products, and
to the use, in preference to these products, of non-
mutagenic substances which stimulate differentiation.

Moreover, the frequency of occurrence of the GC
and AT alleles is as follows: in the population,
F(AT) = 0.15, and on a sample of 25 (neuroblastoma)
patients, F(AT) = 0.30. These statistics indicate that the
AT allele could be a predisposing factor.
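
To make the comparison concrete, the two frequencies can be expressed as a ratio and as an allelic odds ratio. This is only an illustrative sketch: the size of the population sample is not stated, so no significance test is attempted, and the variable and function names are assumptions.

```python
F_POPULATION = 0.15   # F(AT) in the population, as stated in the text
F_PATIENTS = 0.30     # F(AT) in the neuroblastoma patients
N_PATIENTS = 25       # number of patients in the sample


def odds(frequency):
    """Odds corresponding to an allele frequency."""
    return frequency / (1 - frequency)


# Number of AT alleles among the 2 x 25 patient chromosomes.
at_alleles_in_patients = round(F_PATIENTS * 2 * N_PATIENTS)

frequency_ratio = F_PATIENTS / F_POPULATION
odds_ratio = odds(F_PATIENTS) / odds(F_POPULATION)

print(at_alleles_in_patients)
print(round(frequency_ratio, 2))
print(round(odds_ratio, 2))
```

On these figures the AT allele is about twice as frequent in the patient sample (15 AT alleles out of 50), corresponding to an allelic odds ratio of roughly 2.4.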
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: sanofi
(B) STREET: 32-34 rue Marbeuf
(C) CITY: PARIS
(E) COUNTRY: FRANCE
(F) POSTAL CODE (ZIP): 75008
(G) TELEPHONE: 01 53 77 40 00
(H) TELEFAX: 01 53 77 41 33

(ii) TITLE OF INVENTION: SR-p70
(iii) NUMBER OF SEQUENCES: 40

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2874 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 156..2066

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
TGCCTCCCCG CCCGCGCACC CGCCCCGAGG CCTGTGCTCC TGCGAAGGGG ACGCAGCGAA 60 
GCCGGGGCCC GCGCCAGGCC GGCCGGGACG GACGCCGATG CCCGGAGCTG CGACGGCTGC 120

173 



ACC TCC CCC GAT GGG GGC ACC ACG TTT GAG CAC CTC TGG AGC TCT CTG 

Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu His Leu Trp Ser Ser Leu 

10 15 20 

GAA CCA GAC AGC ACC TAC TTC GAC CTT CCC CAG TCA AGC CGG GGG AAT 

Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro Gln Ser Ser Arg Gly Asn

25 30 35 

AAT GAG GTG GTG GGT GGC ACG GAT TCC AGC ATG GAC GTC TTC CAC CTA 

Asn Glu Val Val Gly Gly Thr Asp Ser Ser Met Asp Val Phe His Leu 

40 45 50 

GAG GGC ATG ACC ACA TCT GTC ATG GCC CAG TTC AAT TTG CTG AGC AGC 

Glu Gly Met Thr Thr Ser Val Met Ala Gln Phe Asn Leu Leu Ser Ser

55 60 65 70 

ACC ATG GAC CAG ATG AGC AGC CGC GCT GCC TCG GCC AGC CCG TAC ACC 

Thr Met Asp Gln Met Ser Ser Arg Ala Ala Ser Ala Ser Pro Tyr Thr

CCG GAG CAC GCC GCC AGC GTG CCC ACC CAT TCA CCC TAC GCA CAG CCC 461
Pro Glu His Ala Ala Ser Val Pro Thr His Ser Pro Tyr Ala Gln Pro
90 95 100

AGC TCC ACC TTC GAC ACC ATG TCG CCC GCG CCT GTC ATC CCC TCC AAC 509
Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro Val Ile Pro Ser Asn
105 110 115

ACC GAC TAT CCC GGA CCC CAC CAC TTC GAG GTC ACT TTC CAG CAG TCC 557
Thr Asp Tyr Pro Gly Pro His His Phe Glu Val Thr Phe Gln Gln Ser
120 125 130

AGC ACG GCC AAG TCA GCC ACC TGG ACG TAC TCC CCA CTC TTG AAG AAA 605
Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser Pro Leu Leu Lys Lys
135 140 145 150

CTC TAC TGC CAG ATC GCC AAG ACA TGC CCC ATC CAG ATC AAG GTG TCC 653
Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile Gln Ile Lys Val Ser
155 160 165

GCC CCA CCG CCC CCG GGC ACC GCC ATC CGG GCC ATG CCT GTC TAC AAG 701
Ala Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala Met Pro Val Tyr Lys
170 175 180

AAG GCG GAG CAC GTG ACC GAC ATC GTG AAG CGC TGC CCC AAC CAC GAG 749
Lys Ala Glu His Val Thr Asp Ile Val Lys Arg Cys Pro Asn His Glu
185 190 195

CTC GGG AGG GAC TTC AAC GAA GGA CAG TCT GCC CCA GCC AGC CAC CTC 797
Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala Pro Ala Ser His Leu
200 205 210

ATC CGT GTG GAA GGC AAT AAT CTC TCG CAG TAT GTG GAC GAC CCT GTC 845
Ile Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr Val Asp Asp Pro Val
215 220 225 230

ACC GGC AGG CAG AGC GTC GTG GTG CCC TAT GAG CCA CCA CAG GTG GGG 893
Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu Pro Pro Gln Val Gly
235 240 245

ACA GAA TTC ACC ACC ATC CTG TAC AAC TTC ATG TGT AAC AGC AGC TGT 941
Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met Cys Asn Ser Ser Cys
250 255 260

GTG GGG GGC ATG AAC CGA CGG CCC ATC CTC ATC ATC ATC ACC CTG GAG 989
Val Gly Gly Met Asn Arg Arg Pro Ile Leu Ile Ile Ile Thr Leu Glu
265 270 275

ACG CGG GAT GGG CAG GTG CTG GGC CGC CGG TCC TTC GAG GGC CGC ATC 1037
Thr Arg Asp Gly Gln Val Leu Gly Arg Arg Ser Phe Glu Gly Arg Ile
280 285 290

TGC GCC TGT CCT GGC CGC GAC CGA AAA GCC GAT GAG GAC CAC TAC CGG 1085
Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp Glu Asp His Tyr Arg
295 300 305 310

GAG CAG CAG GCC TTG AAT GAG AGC TCC GCC AAG AAC GGG GCT GCC AGC 1133
Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys Asn Gly Ala Ala Ser
315 320 325

AAG CGC GCC TTC AAG CAG AGT CCC CCT GCC GTC CCC GCC CTG GGC CCG 1181
Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Val Pro Ala Leu Gly Pro
330 335 340

GGT GTG AAG AAG CGG CGG CAC GGA GAC GAG GAC ACG TAC TAC CTG CAG 1229
Gly Val Lys Lys Arg Arg His Gly Asp Glu Asp Thr Tyr Tyr Leu Gln
345 350 355

GTG CGA GGC CGC GAG AAC TTC GAG ATC CTG ATG AAG CTG AAG GAG AGC 1277
Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met Lys Leu Lys Glu Ser
360 365 370

CTG GAG CTG ATG GAG TTG GTG CCG CAG CCG CTG GTA GAC TCC TAT CGG 1325
Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu Val Asp Ser Tyr Arg
375 380 385 390

CAG CAG CAG CAG CTC CTA CAG AGG CCG AGT CAC CTA CAG CCC CCA TCC 1373
Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser His Leu Gln Pro Pro Ser
395 400 405

TAC GGG CCG GTC CTC TCG CCC ATG AAC AAG GTG CAC GGG GGC GTG AAC 1421
Tyr Gly Pro Val Leu Ser Pro Met Asn Lys Val His Gly Gly Val Asn
410 415 420

AAG CTG CCC TCC GTC AAC CAG CTG GTG GGC CAG CCT CCC CCG CAC AGC 1469
Lys Leu Pro Ser Val Asn Gln Leu Val Gly Gln Pro Pro Pro His Ser
425 430 435

TCG GCA GCT ACA CCC AAC CTG GGA CCT GTG GGC TCT GGG ATG CTC AAC 1517
Ser Ala Ala Thr Pro Asn Leu Gly Pro Val Gly Ser Gly Met Leu Asn
440 445 450

AAC CAC GGC CAC GCA GTG CCA GCC AAC AGC GAG ATG ACC AGC AGC CAC 1565
Asn His Gly His Ala Val Pro Ala Asn Ser Glu Met Thr Ser Ser His
455 460 465 470

GGC ACC CAG TCC ATG GTC TCG GGG TCC CAC TGC ACT CCG CCA CCC CCC 1613
Gly Thr Gln Ser Met Val Ser Gly Ser His Cys Thr Pro Pro Pro Pro
475 480 485

TAC CAC GCC GAC CCC AGC CTC GTC AGT TTT TTA ACA GGA TTG GGG TGT 1661
Tyr His Ala Asp Pro Ser Leu Val Ser Phe Leu Thr Gly Leu Gly Cys
490 495 500

CCA AAC TGC ATC GAG TAT TTC ACG TCC CAG GGG TTA CAG AGC ATT TAC 1709
Pro Asn Cys Ile Glu Tyr Phe Thr Ser Gln Gly Leu Gln Ser Ile Tyr
505 510 515

CAC CTG CAG AAC CTG ACC ATC GAG GAC CTG GGG GCC CTG AAG ATC CCC 1757
His Leu Gln Asn Leu Thr Ile Glu Asp Leu Gly Ala Leu Lys Ile Pro
520 525 530

GAG CAG TAT CGC ATG ACC ATC TGG CGG GGC CTG CAG GAC CTG AAG CAG 1805
Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly Leu Gln Asp Leu Lys Gln
535 540 545 550

GGC CAC GAC TAC GGC GCC GCC GCG CAG CAG CTG CTC CGC TCC AGC AAC 1853
Gly His Asp Tyr Gly Ala Ala Ala Gln Gln Leu Leu Arg Ser Ser Asn
555 560 565

GCG GCC GCC ATT TCC ATC GGC GGC TCC GGG GAG CTG CAG CGC CAG CGG 1901
Ala Ala Ala Ile Ser Ile Gly Gly Ser Gly Glu Leu Gln Arg Gln Arg
570 575 580

GTC ATG GAG GCC GTG CAC TTC CGC GTG CGC CAC ACC ATC ACC ATC CCC 1949
Val Met Glu Ala Val His Phe Arg Val Arg His Thr Ile Thr Ile Pro
585 590 595

AAC CGC GGC GGC CCC GGC GCC GGC CCC GAC GAG TGG GCG GAC TTC GGC 1997
Asn Arg Gly Gly Pro Gly Ala Gly Pro Asp Glu Trp Ala Asp Phe Gly
600 605 610

TTC GAC CTG CCC GAC TGC AAG GCC CGC AAG CAG CCC ATC AAG GAG GAG 2045
Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln Pro Ile Lys Glu Glu
615 620 625 630

TTC ACG GAG GCC GAG ATC CAC TGAGGGGCCG GGCCCAGCCA GAGCCTGTGC 2096
Phe Thr Glu Ala Glu Ile His
635

CACCGCCCAG AGACCCAGGC CGCCTCGCTC TCCTTCCTGT GTCCAAAACT GCCTCCGGAG 2156
GCAGGGCCTC CAGGCTGTGC CCGGGGAAAG GCAAGGTCCG GCCCATGCCC CGGCACCTCA 2216

CCGGCCCCAG GAGAGGCCCA GCCACCAAAG CCGCCTGCGG ACAGCCTGAG TCACCTGCAG 2276

AACCTTCTGG AGCTGCCCTA ATGCTGGGCT X GC GGGGC AG GGGCCGGCCC ACTCTCAGCC 2336 

CTGCCACTGC CGGGCGTGCT CCATGGCAGG CGTGGGTGGG GACCGCAGTG TCAGCTCCGA 2396 

CCTCCAGGCC TCATCCTAGA GACTCTGTCA TCTGCCGATC AAGCAAGGTC CTTCCAGAGG 2456

AAAGAATCCT CTTCGCTGGT GGACTGCCAA AAAGTATTTT GCGACATCTT TTGGTTCTGG 2516

AGAGTGGTGA GCAGCCAAGC GACTGTGTCT GAAACACCGT GCATTTTCAG GGAATGTCCC 2576

TAACGGGCTG GGGACTCTCT CTGCTGGACT TGGGAGTGGC CTTTGCCCCC AGCACACTGT 2636

ATTCTGCGGG ACCGCCTCCT TCCTGCCCCT AACAACCACC AAAGTGTTGC TGAAATTGGA 2696

GAAAACTGGG GAAGGCGCAA CCCCTCCCAG GTGCGGGAAG CATCTGGTAC CGCCTCGGCC 2756

AGTGCCCCTC AGCCTGGCCA CAGTCACCTC TCCTTGGGGA ACCCTGGGCA GAAAGGGACA 2816

GCCTGTCCTT AGAGGACCGG AAATTGTCAA TATTTGATAA AATGATACCC TTTTCTAC 2874

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 637 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Gln Ser Thr Thr Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu
1 5 10 15

His Leu Trp Ser Ser Leu Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro
20 25 30

Gln Ser Ser Arg Gly Asn Asn Glu Val Val Gly Gly Thr Asp Ser Ser
35 40 45

Met Asp Val Phe His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln
50 55 60

Phe Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala
65 70 75 80

Ser Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His
85 90 95



Val Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr
130 135 140 



Arg Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser

Tyr Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr
225 230 235 240



Met Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu
260 265 270

Ile Ile Ile Thr Leu Glu Thr Arg Asp Gly Gln Val Leu Gly Arg Arg
275 280 285

Ser Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala
290 295 300

Asp Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala
305 310 315 320

Lys Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala
325 330 335

Val Pro Ala Leu Gly Pro Gly Val Lys Lys Arg Arg His Gly Asp Glu
340 345 350



Met Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro
370 375 380

Leu Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser
385 390 395 400

His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys
405 410 415

Val His Gly Gly Val Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly
420 425 430

Gln Pro Pro Pro His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val
435 440 445

Gly Ser Gly Met Leu Asn Asn His Gly His Ala Val Pro Ala Asn Ser
450 455 460



Gly Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu
515 520 525

Gly Ala Leu Lys Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly
530 535 540



Glu Leu Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg
580 585 590

His Thr Ile Thr Ile Pro Asn Arg Gly Gly Pro Gly Ala Gly Pro Asp
595 600 605

Glu Trp Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys
610 615 620

Gln Pro Ile Lys Glu Glu Phe Thr Glu Ala Glu Ile His
625 630 635

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2034 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Cebus apella

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 156..1652

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TGCCTCCCCG CCCGCGCACC CGCCCCGAGG CCTGTGCTCC TGCGAAGGGG ACGCAGCGAA 60 
GCCGGGGCCC GCGCCAGGCC GGCCGGGACG GACGCCGATG CCCGGAGCTG CGACGGCTGC 120

ACC TCC CCC GAT GGG GGC ACC ACG TTT GAG CAC CTC TGG AGC TCT CTG

Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu His Leu Trp Ser Ser Leu 

10 15 20 

GAA CCA GAC AGC ACC TAC TTC GAC CTT CCC CAG TCA AGC CGG GGG AAT 

Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro Gln Ser Ser Arg Gly Asn

25 30 35 

AAT GAG GTG GTG GGT GGC ACG GAT TCC AGC ATG GAC GTC TTC CAC CTA 

Asn Glu Val Val Gly Gly Thr Asp Ser Ser Met Asp Val Phe His Leu 

40 45 50 

GAG GGC ATG ACC ACA TCT GTC ATG GCC CAG TTC AAT TTG CTG AGC AGC 

Glu Gly Met Thr Thr Ser Val Met Ala Gln Phe Asn Leu Leu Ser Ser

55 60 65 70 

ACC ATG GAC CAG ATG AGC AGC CGC GCT GCC TCG GCC AGC CCG TAC ACC 

Thr Met Asp Gln Met Ser Ser Arg Ala Ala Ser Ala Ser Pro Tyr Thr

75 80 85 

CCG GAG CAC GCC GCC AGC GTG CCC ACC CAT TCA CCC TAC GCA CAG CCC 

Pro Glu His Ala Ala Ser Val Pro Thr His Ser Pro Tyr Ala Gln Pro

90 95 100 

AGC TCC ACC TTC GAC ACC ATG TCG CCC GCG CCT GTC ATC CCC TCC AAC 

Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro Val Ile Pro Ser Asn

105 110 115 



ACC GAC TAT CCC GGA CCC CAC CAC TTC GAG GTC ACT TTC CAG CAG TCC 
Thr Asp Tyr Pro Gly Pro His His Phe Glu Val Thr Phe Gln Gln Ser
120 125 130 



AGC ACG GCC AAG TCA GCC ACC TGG ACG TAC TCC CCA CTC TTG AAG AAA 605
Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser Pro Leu Leu Lys Lys
135 140 145 150

CTC TAC TGC CAG ATC GCC AAG ACA TGC CCC ATC CAG ATC AAG GTG TCC 653
Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile Gln Ile Lys Val Ser
155 160 165

GCC CCA CCG CCC CCG GGC ACC GCC ATC CGG GCC ATG CCT GTC TAC AAG 701
Ala Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala Met Pro Val Tyr Lys
170 175 180

AAG GCG GAG CAC GTG ACC GAC ATC GTG AAG CGC TGC CCC AAC CAC GAG 749
Lys Ala Glu His Val Thr Asp Ile Val Lys Arg Cys Pro Asn His Glu
185 190 195

CTC GGG AGG GAC TTC AAC GAA GGA CAG TCT GCC CCA GCC AGC CAC CTC 797
Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala Pro Ala Ser His Leu
200 205 210

ATC CGT GTG GAA GGC AAT AAT CTC TCG CAG TAT GTG GAC GAC CCT GTC 845
Ile Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr Val Asp Asp Pro Val
215 220 225 230

ACC GGC AGG CAG AGC GTC GTG GTG CCC TAT GAG CCA CCA CAG GTG GGG 893
Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu Pro Pro Gln Val Gly
235 240 245

ACA GAA TTC ACC ACC ATC CTG TAC AAC TTC ATG TGT AAC AGC AGC TGT 941
Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met Cys Asn Ser Ser Cys
250 255 260

GTG GGG GGC ATG AAC CGA CGG CCC ATC CTC ATC ATC ATC ACC CTG GAG 989
Val Gly Gly Met Asn Arg Arg Pro Ile Leu Ile Ile Ile Thr Leu Glu
265 270 275

ACG CGG GAT GGG CAG GTG CTG GGC CGC CGG TCC TTC GAG GGC CGC ATC 1037
Thr Arg Asp Gly Gln Val Leu Gly Arg Arg Ser Phe Glu Gly Arg Ile
280 285 290

TGC GCC TGT CCT GGC CGC GAC CGA AAA GCC GAT GAG GAC CAC TAC CGG 1085
Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp Glu Asp His Tyr Arg
295 300 305 310

GAG CAG CAG GCC TTG AAT GAG AGC TCC GCC AAG AAC GGG GCT GCC AGC 1133
Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys Asn Gly Ala Ala Ser
315 320 325

AAG CGC GCC TTC AAG CAG AGT CCC CCT GCC GTC CCC GCC CTG GGC CCG 1181
Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Val Pro Ala Leu Gly Pro
330 335 340

GGT GTG AAG AAG CGG CGG CAC GGA GAC GAG GAC ACG TAC TAC CTG CAG 1229
Gly Val Lys Lys Arg Arg His Gly Asp Glu Asp Thr Tyr Tyr Leu Gln
345 350 355

GTG CGA GGC CGC GAG AAC TTC GAG ATC CTG ATG AAG CTG AAG GAG AGC 1277
Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met Lys Leu Lys Glu Ser
360 365 370

CTG GAG CTG ATG GAG TTG GTG CCG CAG CCG CTG GTA GAC TCC TAT CGG 1325
Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu Val Asp Ser Tyr Arg
375 380 385 390

CAG CAG CAG CAG CTC CTA CAG AGG CCG AGT CAC CTA CAG CCC CCA TCC 1373
Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser His Leu Gln Pro Pro Ser
395 400 405

TAC GGG CCG GTC CTC TCG CCC ATG AAC AAG GTG CAC GGG GGC GTG AAC 1421
Tyr Gly Pro Val Leu Ser Pro Met Asn Lys Val His Gly Gly Val Asn
410 415 420

AAG CTG CCC TCC GTC AAC CAG CTG GTG GGC CAG CCT CCC CCG CAC AGC 1469
Lys Leu Pro Ser Val Asn Gln Leu Val Gly Gln Pro Pro Pro His Ser
425 430 435

TCG GCA GCT ACA CCC AAC CTG GGA CCT GTG GGC TCT GGG ATG CTC AAC 1517
Ser Ala Ala Thr Pro Asn Leu Gly Pro Val Gly Ser Gly Met Leu Asn
440 445 450

AAC CAC GGC CAC GCA GTG CCA GCC AAC AGC GAG ATG ACC AGC AGC CAC 1565
Asn His Gly His Ala Val Pro Ala Asn Ser Glu Met Thr Ser Ser His
455 460 465 470

GGC ACC CAG TCC ATG GTC TCG GGG TCC CAC TGC ACT CCG CCA CCC CCC 1613
Gly Thr Gln Ser Met Val Ser Gly Ser His Cys Thr Pro Pro Pro Pro
475 480 485

TAC CAC GCC GAC CCC AGC CTC GTC AGG ACC TGG GGG CCC TGAAGATCCC 1662
Tyr His Ala Asp Pro Ser Leu Val Arg Thr Trp Gly Pro
490 495

CGAGCAGTAT CGCATGACCA TCTGGCGGGG CCTGCAGGAC CTGAAGCAGG GCCACGACTA 1722 

CGGCGCCGCC GCGCAGCAGC TGCTCCGCTC CAGCAACGCG GCCGCCATTT CCATCGGCGG 1782

CTCCGGGGAG CTGCAGCGCC AGCGGGTCAT GGAGGCCGTG CACTTCCGCG TGCGCCACAC 1842

CATCACCATC CCCAACCGCG GCGGCCCCGG CGCCGGCCCC GACGAGTGGG CGGACTTCGG 1902

CTTCGACCTG CCCGACTGCA AGGCCCGCAA GCAGCCCATC AAGGAGGAGT TCACGGAGGC 1962

CGAGATCCAC TGAGGGGCCG GGCCCAGCCA GAGCCTGTGC CACCGCCCAG AGACCCAGGC 2022

CGCCTCGCTC TC 2034 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 499 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ala Gln Ser Thr Thr Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu
1 5 10 15

His Leu Trp Ser Ser Leu Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro
20 25 30

Gln Ser Ser Arg Gly Asn Asn Glu Val Val Gly Gly Thr Asp Ser Ser
35 40 45

Met Asp Val Phe His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln
50 55 60

Phe Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala
65 70 75 80

Ser Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His
85 90 95



Val Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr
130 135 140

Gln Ile Lys Val Ser Ala Pro Pro Pro Pro Gly
165 170



Asn Leu Ser Gln



Val Val
Leu Tyr



Asp Arg
Glu Ser
Ser Pro



Ile Leu
Arg Arg
Lys Ala



335
Asp Glu



Leu Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu

385 390 395



His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser
405 410



Gly Ser Gly Met Leu Asn Asn His Gly His Ala Val
450 455 460



Val Pro

Gln Arg

Pro Met

Gln Leu
430

Leu Gly
445

Pro Ala
Ser Gly
Leu Val



Asn Lys
415

Val Gly
Pro Val



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2156 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 33..1940

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



TCC CCT GAT GGG GGC ACC ACG TTT GAG CAC CTC TGG AGC TCT CTG GAA 
Ser Pro Asp Gly Gly Thr Thr Phe Glu His Leu Trp Ser Ser Leu Glu 

10 15 20 

CCA GAC AGC ACC TAC TTC GAC CTT CCC CAG TCA AGC CGG GGG AAT AAT
Pro Asp Ser Thr Tyr Phe Asp Leu Pro Gln Ser Ser Arg Gly Asn Asn
25 30 35

GAG GTG GTG GGC GGA ACG GAT TCC AGC ATG GAC GTC TTC CAC CTG GAG
Glu Val Val Gly Gly Thr Asp Ser Ser Met Asp Val Phe His Leu Glu
40 45 50 55

GGC ATG ACT ACA TCT GTC ATG GCC CAG TTC AAT CTG CTG AGC AGC ACC
Gly Met Thr Thr Ser Val Met Ala Gln Phe Asn Leu Leu Ser Ser Thr
60 65 70

ATG GAC CAG ATG AGC AGC CGC GCG GCC TCG GCC AGC CCC TAC ACC CCA
Met Asp Gln Met Ser Ser Arg Ala Ala Ser Ala Ser Pro Tyr Thr Pro
75 80 85

GAG CAC GCC GCC AGC GTG CCC ACC CAC TCG CCC TAC GCA CAA CCC AGC
Glu His Ala Ala Ser Val Pro Thr His Ser Pro Tyr Ala Gln Pro Ser
90 95 100

TCC ACC TTC GAC ACC ATG TCG CCG GCG CCT GTC ATC CCC TCC AAC ACC
Ser Thr Phe Asp Thr Met Ser Pro Ala Pro Val Ile Pro Ser Asn Thr
105 110 115

GAC TAC CCC GGA CCC CAC CAC TTT GAG GTC ACT TTC CAG CAG TCC AGC
Asp Tyr Pro Gly Pro His His Phe Glu Val Thr Phe Gln Gln Ser Ser
120 125 130 135

ACG GCC AAG TCA GCC ACC TGG ACG TAC TCC CCG CTC TTG AAG AAA CTC
Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser Pro Leu Leu Lys Lys Leu
140 145 150

TAC TGC CAG ATC GCC AAG ACA TGC CCC ATC CAG ATC AAG GTG TCC ACC
Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile Gln Ile Lys Val Ser Thr
155 160 165

CCG CCA CCC CCA GGC ACT GCC ATC CGG GCC ATG CCT GTT TAC AAG AAA
Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala Met Pro Val Tyr Lys Lys
170 175 180

GCG GAG CAC GTG ACC GAC GTC GTG AAA CGC TGC CCC AAC CAC GAG CTC
Ala Glu His Val Thr Asp Val Val Lys Arg Cys Pro Asn His Glu Leu
185 190 195

GGG AGG GAC TTC AAC GAA GGA CAG TCT GCT CCA GCC AGC CAC CTC ATC 
Gly Arg Asp Phe Asn Glu Gly Gin Ser Ala Pro Ala Ser His Leu He 

200 205 210 215

CGC GTG GAA GGC AAT AAT CTC TCG CAG TAT GTG GAT GAC CCT GTC ACC 725
Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr Val Asp Asp Pro Val Thr
220 225 230

GGC AGG CAG AGC GTC GTG GTG CCC TAT GAG CCA CCA CAG GTG GGG ACG 773
Gly Arg Gln Ser Val Val Val Pro Tyr Glu Pro Pro Gln Val Gly Thr
235 240 245

GAA TTC ACC ACC ATC CTG TAC AAC TTC ATG TGT AAC AGC AGC TGT GTA 821
Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met Cys Asn Ser Ser Cys Val
250 255 260

GGG GGC ATG AAC CGG CGG CCC ATC CTC ATC ATC ATC ACC CTG GAG ATG 869
Gly Gly Met Asn Arg Arg Pro Ile Leu Ile Ile Ile Thr Leu Glu Met
265 270 275

CGG GAT GGG CAG GTG CTG GGC CGC CGG TCC TTT GAG GGC CGC ATC TGC 917
Arg Asp Gly Gln Val Leu Gly Arg Arg Ser Phe Glu Gly Arg Ile Cys
280 285 290 295

GCC TGT CCT GGC CGC GAC CGA AAA GCT GAT GAG GAC CAC TAC CGG GAG 965
Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp Glu Asp His Tyr Arg Glu
300 305 310

CAG CAG GCC CTG AAC GAG AGC TCC GCC AAG AAC GGG GCC GCC AGC AAG 1013
Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys Asn Gly Ala Ala Ser Lys
315 320 325

CGT GCC TTC AAG CAG AGC CCC CCT GCC GTC CCC GCC CTT GGT GCC GGT 1061
Arg Ala Phe Lys Gln Ser Pro Pro Ala Val Pro Ala Leu Gly Ala Gly
330 335 340

GTG AAG AAG CGG CGG CAT GGA GAC GAG GAC ACG TAC TAC CTT CAG GTG 1109
Val Lys Lys Arg Arg His Gly Asp Glu Asp Thr Tyr Tyr Leu Gln Val
345 350 355

CGA GGC CGG GAG AAC TTT GAG ATC CTG ATG AAG CTG AAA GAG AGC CTG 1157
Arg Gly Arg Glu Asn Phe Glu Ile Leu Met Lys Leu Lys Glu Ser Leu
360 365 370 375

GAG CTG ATG GAG TTG GTG CCG CAG CCA CTG GTG GAC TCC TAT CGG CAG 1205
Glu Leu Met Glu Leu Val Pro Gln Pro Leu Val Asp Ser Tyr Arg Gln
380 385 390

CAG CAG CAG CTC CTA CAG AGG CCG AGT CAC CTA CAG CCC CCG TCC TAC 1253
Gln Gln Gln Leu Leu Gln Arg Pro Ser His Leu Gln Pro Pro Ser Tyr
395 400 405

GGG CCG GTC CTC TCG CCC ATG AAC AAG GTG CAC GGG GGC ATG AAC AAG 1301
Gly Pro Val Leu Ser Pro Met Asn Lys Val His Gly Gly Met Asn Lys
410 415 420

CTG CCC TCC GTC AAC CAG CTG GTG GGC CAG CCT CCC CCG CAC AGT TCG 1349
Leu Pro Ser Val Asn Gln Leu Val Gly Gln Pro Pro Pro His Ser Ser
425 430 435

GCA GCT ACA CCC AAC CTG GGG CCC GTG GGC CCC GGG ATG CTC AAC AAC 1397
Ala Ala Thr Pro Asn Leu Gly Pro Val Gly Pro Gly Met Leu Asn Asn
440 445 450 455

CAT GGC CAC GCA GTG CCA GCC AAC GGC GAG ATG AGC AGC AGC CAC AGC 1445
His Gly His Ala Val Pro Ala Asn Gly Glu Met Ser Ser Ser His Ser
460 465 470

GCC CAG TCC ATG GTC TCG GGG TCC CAC TGC ACT CCG CCA CCC CCC TAC 1493
Ala Gln Ser Met Val Ser Gly Ser His Cys Thr Pro Pro Pro Pro Tyr
475 480 485

CAC GCC GAC CCC AGC CTC GTC AGT TTT TTA ACA GGA TTG GGG TGT CCA 1541
His Ala Asp Pro Ser Leu Val Ser Phe Leu Thr Gly Leu Gly Cys Pro
490 495 500

AAC TGC ATC GAG TAT TTC ACC TCC CAA GGG TTA CAG AGC ATT TAC CAC 1589
Asn Cys Ile Glu Tyr Phe Thr Ser Gln Gly Leu Gln Ser Ile Tyr His
505 510 515

CTG CAG AAC CTG ACC ATT GAG GAC CTG GGG GCC CTG AAG ATC CCC GAG 1637
Leu Gln Asn Leu Thr Ile Glu Asp Leu Gly Ala Leu Lys Ile Pro Glu
520 525 530 535

CAG TAC CGC ATG ACC ATC TGG CGG GGC CTG CAG GAC CTG AAG CAG GGC 1685
Gln Tyr Arg Met Thr Ile Trp Arg Gly Leu Gln Asp Leu Lys Gln Gly
540 545 550

CAC GAC TAC AGC ACC GCG CAG CAG CTG CTC CGC TCT AGC AAC GCG GCC 1733
His Asp Tyr Ser Thr Ala Gln Gln Leu Leu Arg Ser Ser Asn Ala Ala
555 560 565

ACC ATC TCC ATC GGC GGC TCA GGG GAA CTG CAG CGC CAG CGG GTC ATG 1781
Thr Ile Ser Ile Gly Gly Ser Gly Glu Leu Gln Arg Gln Arg Val Met
570 575 580

GAG GCC GTG CAC TTC CGC GTG CGC CAC ACC ATC ACC ATC CCC AAC CGC 1829
Glu Ala Val His Phe Arg Val Arg His Thr Ile Thr Ile Pro Asn Arg
585 590 595

GGC GGC CCA GGC GGC GGC CCT GAC GAG TGG GCG GAC TTC GGC TTC GAC 1877
Gly Gly Pro Gly Gly Gly Pro Asp Glu Trp Ala Asp Phe Gly Phe Asp
600 605 610 615

CTG CCC GAC TGC AAG GCC CGC AAG CAG CCC ATC AAG GAG GAG TTC ACG 1925
Leu Pro Asp Cys Lys Ala Arg Lys Gln Pro Ile Lys Glu Glu Phe Thr
620 625 630

GAG GCC GAG ATC CAC TGAGGGCCTC GCCTGGCTGC AGCCTGCGCC ACCGCCCAGA 1980
Glu Ala Glu Ile His
635

GACCCAAGCT GCCTCCCCTC TCCTTCCTGT GTGTCCAAAA CTGCCTCAGG AGGCAGGACC 2040

TTCGGGCTGT GCCCGGGGAA AGGCAAGGTC CGGCCCATCC CCAGGCACCT CACAGGCCCC 2100

AGGAAAGGCC CAGCCACCGA AGCCGCCTGT GGACAGCCTG AGTCACCTGC AGAACC 2156
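
As a consistency check on the feature table for SEQ ID NO: 5, the CDS location 33..1940 can be converted into a codon count and compared with the 636 amino acids stated for SEQ ID NO: 6. The following is only a minimal sketch of that arithmetic; the function name `cds_codon_count` is illustrative and is not part of the listing, and it assumes 1-based inclusive coordinates that exclude the stop codon.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in a CDS.

    Assumes 1-based, inclusive coordinates (as in a "33..1940" location).
    """
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} is not a whole number of codons")
    return length // 3


# CDS location given for SEQ ID NO: 5
print(cds_codon_count(33, 1940))  # 636
```

The same check applied to a location whose length is not a multiple of three raises an error, which can help flag a mis-transcribed coordinate.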



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 636 amino acids
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Gln Ser Thr Ala Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu
1 5 10 15

His Leu Trp Ser Ser Leu Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro
20 25 30

Gln Ser Ser Arg Gly Asn Asn Glu Val Val Gly Gly Thr Asp Ser Ser
35 40 45

Met Asp Val Phe His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln
50 55 60

Phe Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala
65 70 75 80

Ser Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His
85 90 95

Ser Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala
100 105 110

Pro Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu
115 120 125

Val Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr
130 135 140

Ser Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro
145 150 155 160

Ile Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg
165 170 175

Ala Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys
180 185 190

Arg Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser
195 200 205

Ala Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln
210 215 220

Tyr Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr
225 230 235 240

Glu Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe
245 250 255

Met Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu
260 265 270

Ile Ile Ile Thr Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg
275 280 285

Ser Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala
290 295 300

Asp Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala
305 310 315 320

Lys Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala
325 330 335

Val Pro Ala Leu Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu
340 345 350

Asp Thr Tyr Tyr Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu
355 360 365

Met Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro
370 375 380

Leu Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser
385 390 395 400

His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys
405 410 415

Val His Gly Gly Met Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly
420 425 430

Gln Pro Pro Pro His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val
435 440 445

Gly Pro Gly Met Leu Asn Asn His Gly His Ala Val Pro Ala Asn Gly
450 455 460

Glu Met Ser Ser Ser His Ser Ala Gln Ser Met Val Ser Gly Ser His
465 470 475 480

Cys Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser Phe
485 490 495

Leu Thr Gly Leu Gly Cys Pro Asn Cys Ile Glu Tyr Phe Thr Ser Gln
500 505 510

Gly Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu
515 520 525

Gly Ala Leu Lys Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly
530 535 540

Leu Gln Asp Leu Lys Gln Gly His Asp Tyr Ser Thr Ala Gln Gln Leu
545 550 555 560

Leu Arg Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu
565 570 575

Leu Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His
580 585 590

Thr Ile Thr Ile Pro Asn Arg Gly Gly Pro Gly Gly Gly Pro Asp Glu
595 600 605

Trp Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln
610 615 620

Pro Ile Lys Glu Glu Phe Thr Glu Ala Glu Ile His
625 630 635
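
For comparison with database entries, the three-letter residue codes used in this listing can be collapsed into one-letter codes. The snippet below is a minimal sketch under the assumption that the input contains only the twenty standard residues separated by whitespace; the names `THREE_TO_ONE` and `to_one_letter` are illustrative and do not come from the application.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError on any token that is not a standard residue code,
    which also surfaces transcription errors in the input.
    """
    return "".join(THREE_TO_ONE[token] for token in residues.split())


# First ten residues of SEQ ID NO: 6
print(to_one_letter("Met Ala Gln Ser Thr Ala Thr Ser Pro Asp"))  # MAQSTATSPD
```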

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2040 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Mus musculus

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 124..1890



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

TGATCTCCCT GTGGCCTGCA GGGGACTGAG CCAGGGAGTA GATGCCCTGA GACCCCAAGG 60

GACACCCAAG GAAACCTTGC TGGCTTTGAG AAAGGGATCG TCTCTCTCCT GCCCAAGAGA 120

AGC ATG TGT ATG GGC CCT GTG TAT GAA TCC TTG GGG CAG GCC CAG TTC 168
Met Cys Met Gly Pro Val Tyr Glu Ser Leu Gly Gln Ala Gln Phe
1 5 10 15

AAT TTG CTC AGC AGT GCC ATG GAC CAG ATG GGC AGC CGT GCG GCC CCG 216
Asn Leu Leu Ser Ser Ala Met Asp Gln Met Gly Ser Arg Ala Ala Pro
20 25 30

GCG AGC CCC TAC ACC CCG GAG CAC GCC GCC AGC GCG CCC ACC CAC TCG 264
Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Ala Pro Thr His Ser
35 40 45

CCC TAC GCG CAG CCC AGC TCC ACC TTC GAC ACC ATG TCT CCG GCG CCT 312
Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro
50 55 60

GTC ATC CCT TCC AAT ACC GAC TAC CCC GGC CCC CAC CAC TTC GAG GTC 360
Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu Val
65 70 75

ACC TTC CAG CAG TCG AGC ACT GCC AAG TCG GCC ACC TGG ACA TAC TCC
Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser
80 85 90 95

CCA Z~Z TTG AAG AAG TTG TAC TGT CAG ATT GCT AAG ACA TGC CCC ATC
Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile
100 105 110

CAG ATC AAA GTG TCC ACA CCA CCA CCC CCG GGC ACG GCC ATC CGG GCC
Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala
115 120 125

ATG CCT GTC TAC AAG AAG GCA GAG CAT GTG ACC GAC ATT GTT AAG CGC
Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Ile Val Lys Arg
130 135 140

TGC CCC AAC CAC GAG CTT GGA AGG GAC TTC AAT GAA GGA CAG TCT GCC
Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala
145 150 155

CCG GCT AGC CAC CTC ATC CGT GTA GAA GGC AAC AAC CTC GCC CAG TAC
Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ala Gln Tyr
160 165 170 175

GTG GAT GAC CCT GTC ACC GGA AGG CAG AGT GTG GTT GTG CCG TAT GAA
Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu
180 185 190

CCC CCA CAG GTG GGA ACA GAA TTT ACC ACC ATC CTG TAC AAC TTC ATG
Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met
195 200 205

TGT AAC AGC AGC TGT GTG GGG GGC ATG AAT CGG AGG CCC ATC CTT GTC
Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu Val
210 215 220

ATC ATC ACC CTG GAG ACC CGG GAT GGA CAG GTC CTG GGC CGC CGG TCT
Ile Ile Thr Leu Glu Thr Arg Asp Gly Gln Val Leu Gly Arg Arg Ser
225 230 235

TTC GAG GGT CGC ATC TGT GCC TGT CCT GGC CGT GAC CGC AAA GCT GAT
Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp
240 245 250 255

GAA GAC CAT TAC CGG GAG CAA CAG GCT CTG AAT GAA AGT ACC ACC AAA
Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Thr Thr Lys
260 265 270

AAT GGA GCT GCC AGC AAA CGT GCA TTC AAG CAG AGC CCC CCT GCC ATC
Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Ile
275 280 285

CCT GCC CTG GGT ACC AAC GTG AAG AAG AGA CGC CAC GGG GAC GAG GAC
Pro Ala Leu Gly Thr Asn Val Lys Lys Arg Arg His Gly Asp Glu Asp
290 295 300

ATG TTC TAC ATG CAC GTG CGA GGC CGG GAG AAC TTT GAG ATC TTG ATG
Met Phe Tyr Met His Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met
305 310 315

AAA GTC AAG GAG AGC CTA GAA CTG ATG GAG CTT GTG CCC CAG CCT TTG
Lys Val Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu
320 325 330 335

GTT GAC TCC TAT CGA CAG CAG CAG CAG CAG CAG CTC CTA CAG AGG CCG
Val Asp Ser Tyr Arg Gln Gln Gln Gln Gln Gln Leu Leu Gln Arg Pro
340 345 350

AGT CAC CTG CAG CCT CCA TCC TAT GGG CCC GTG CTC TCC CCA ATG AAC
Ser His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn
355 360 365

AAG GTA CAC GGT GGT GTC AAC AAA CTG CCC TCC GTC AAC CAG CTG GTG
Lys Val His Gly Gly Val Asn Lys Leu Pro Ser Val Asn Gln Leu Val
370 375 380

GGC CAG CCT CCC CCG CAC AGC TCA GCA GCT GGG CCC AAC CTG GGG CCC
Gly Gln Pro Pro Pro His Ser Ser Ala Ala Gly Pro Asn Leu Gly Pro
385 390 395

ATG GGC TCC GGG ATG CTC AAC AGC CAC GGC CAC AGC ATG CCG GCC AAT
Met Gly Ser Gly Met Leu Asn Ser His Gly His Ser Met Pro Ala Asn
400 405 410 415

GGT GAG ATG AAT GGA GGC CAC AGC TCC CAG ACC ATG GTT TCG GGA TCC
Gly Glu Met Asn Gly Gly His Ser Ser Gln Thr Met Val Ser Gly Ser
420 425 430

CAC TGC ACC CCG CCA CCC CCC TAT CAT GCA GAC CCC AGC CTC GTC AGT
His Cys Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser
435 440 445

TTT TTG ACA GGG TTG GGG TGT CCA AAC TGC ATC GAG TGC TTC ACT TCC
Phe Leu Thr Gly Leu Gly Cys Pro Asn Cys Ile Glu Cys Phe Thr Ser
450 455 460

CAA GGG TTG CAG AGC ATC TAC CAC CTG CAG AAC CTT ACC ATC GAG GAC
Gln Gly Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp
465 470 475

CTT GGG GCT CTG AAG GTC CCT GAC CAG TAC CGT ATG ACC ATC TGG AGG 1608
Leu Gly Ala Leu Lys Val Pro Asp Gln Tyr Arg Met Thr Ile Trp Arg
480 485 490 495

GGC CTA CAG GAC CTG AAG CAG AGC CAT GAC TGC GGC CAG CAA CTG CTA 1656
Gly Leu Gln Asp Leu Lys Gln Ser His Asp Cys Gly Gln Gln Leu Leu
500 505 510

CGC TCC AGC AGC AAC GCG GCC ACC ATC TCC ATC GGC GGC TCT GGC GAG 1704
Arg Ser Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu
515 520 525

CTG CAG CGG CAG CGG GTC ATG GAA GCC GTG CAT TTC CGT GTG CGC CAC 1752
Leu Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His
530 535 540

ACC ATC ACA ATC CCC AAC CGT GGA GGC GCA GGT GCG GTG ACA GGT CCC 1800
Thr Ile Thr Ile Pro Asn Arg Gly Gly Ala Gly Ala Val Thr Gly Pro
545 550 555

GAC GAG TGG GCG GAC TTT GGC TTT GAC CTG CCT GAC TGC AAG TCC CGT 1848
Asp Glu Trp Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ser Arg
560 565 570 575

AAG CAG CCC ATC AAA GAG GAG TTC ACA GAG ACA GAG AGC CAC 1890
Lys Gln Pro Ile Lys Glu Glu Phe Thr Glu Thr Glu Ser His
580 585

TGAGGAACGT ACCTTCTTCT CCTGTCCTTC CTCTGTGAGA AACTGCTCTT GGAAGTGGGA 1950

CCTGTTGGCT GTGCCCACAG AAACCAGCAA GGACCTTCTG CCGGATGCCA TTCCTGAAGG 2010

GAAGTCGCTC ATGAACTAAC TCCCTCTTGG 2040



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 589 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Cys Met Gly Pro Val Tyr Glu Ser Leu Gly Gln Ala Gln Phe Asn
1 5 10 15

Leu Leu Ser Ser Ala Met Asp Gln Met Gly Ser Arg Ala Ala Pro Ala
20 25 30

Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Ala Pro Thr His Ser Pro
35 40 45

Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro Val
50 55 60

Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu Val Thr
65 70 75 80

Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser Pro
85 90 95

Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile Gln
100 105 110

Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala Met
115 120 125

Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Ile Val Lys Arg Cys
130 135 140

Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala Pro
145 150 155 160

Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ala Gln Tyr Val
165 170 175

Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu Pro
180 185 190

Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met Cys
195 200 205

Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu Val Ile
210 215 220

Ile Thr Leu Glu Thr Arg Asp Gly Gln Val Leu Gly Arg Arg Ser Phe
225 230 235 240

Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp Glu
245 250 255

Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Thr Thr Lys Asn
260 265 270

Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Ile Pro
275 280 285

Ala Leu Gly Thr Asn Val Lys Lys Arg Arg His Gly Asp Glu Asp Met
290 295 300

Phe Tyr Met His Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met Lys
305 310 315 320

Val Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu Val
325 330 335

Asp Ser Tyr Arg Gln Gln Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser
340 345 350

His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys
355 360 365

Val His Gly Gly Val Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly
370 375 380

Gln Pro Pro Pro His Ser Ser Ala Ala Gly Pro Asn Leu Gly Pro Met
385 390 395 400

Gly Ser Gly Met Leu Asn Ser His Gly His Ser Met Pro Ala Asn Gly
405 410 415

Glu Met Asn Gly Gly His Ser Ser Gln Thr Met Val Ser Gly Ser His
420 425 430

Cys Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser Phe
435 440 445

Leu Thr Gly Leu Gly Cys Pro Asn Cys Ile Glu Cys Phe Thr Ser Gln
450 455 460

Gly Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu
465 470 475 480

Gly Ala Leu Lys Val Pro Asp Gln Tyr Arg Met Thr Ile Trp Arg Gly
485 490 495

Leu Gln Asp Leu Lys Gln Ser His Asp Cys Gly Gln Gln Leu Leu Arg
500 505 510

Ser Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu Leu
515 520 525

Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His Thr
530 535 540

Ile Thr Ile Pro Asn Arg Gly Gly Ala Gly Ala Val Thr Gly Pro Asp
545 550 555 560

Glu Trp Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ser Arg Lys
565 570 575

Gln Pro Ile Lys Glu Glu Phe Thr Glu Thr Glu Ser His
580 585

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 758 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 389.. 757 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TGGTCCCGCT TCGACCAAGA CTCCGGCTAC CAGCTTGCGG GCCCCGCGGA GGAGGAGACC 60
CCGCTGGGGC TAGCTGGGCG ACGCGCGCCA AGCGGCGGCG GGAAGGAGGC GGGAGGAGCG 120
GGGCCCGAGA CCCCGACTCG GGCAGAGCCA GCTGGGGAGG CGGGGCGCGC GTGGGAGCCA 180
GGGGCCCGGG TGGCCGGCCC TCCTCCGCCA CGGCTGAGTG CCCGCGCTGC CTTCCCGCCG 240
GTCCGCCAAG AAAGGCGCTA AGCCTGCGGC AGTCCCCTCG CCGCCGCCTC CCTGCTCCGC 300
ACCCTTATAA CCCGCCGTCC CGCATCCAGG CGAGGAGGCA ACGCTGCAGC CCAGCCCTCG 360

CCGACGCCGA CGCCCGGCCC GGAGCAGA ATG AGC GGC AGC GTT GGG GAG ATG 412
Met Ser Gly Ser Val Gly Glu Met
1 5

GCC CAG ACC TCT TCT TCC TCC TCC TCC ACC TTC GAG CAC CTG TGG AGT 460
Ala Gln Thr Ser Ser Ser Ser Ser Ser Thr Phe Glu His Leu Trp Ser
10 15 20

TCT CTA GAG CCA GAC AGC ACC TAC TTT GAC CTC CCC CAG CCC AGC CAA 508
Ser Leu Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro Gln Pro Ser Gln
25 30 35 40

GGG ACT AGC GAG GCA TCA GGC AGC GAG GAG TCC AAC ATG GAT GTC TTC 556
Gly Thr Ser Glu Ala Ser Gly Ser Glu Glu Ser Asn Met Asp Val Phe
45 50 55

CAC CTG CAA GGC ATG GCC CAG TTC AAT TTG CTC AGC AGT GCC ATG GAC 604
His Leu Gln Gly Met Ala Gln Phe Asn Leu Leu Ser Ser Ala Met Asp
60 65 70

CAG ATG GGC AGC CGT GCG GCC CCG GCG AGC CCC TAC ACC CCG GAG CAC 652
Gln Met Gly Ser Arg Ala Ala Pro Ala Ser Pro Tyr Thr Pro Glu His
75 80 85

GCC GCC AGC GCG CCC ACC CAC TCG CCC TAC GCG CAG CCC AGC TCC ACC 700
Ala Ala Ser Ala Pro Thr His Ser Pro Tyr Ala Gln Pro Ser Ser Thr
90 95 100

TTC GAC ACC ATG TCT CCG GCG CCT GTC ATC CCT TCC AAT ACC GAC TAC 748
Phe Asp Thr Met Ser Pro Ala Pro Val Ile Pro Ser Asn Thr Asp Tyr
105 110 115 120

CCC GGC CCC C 758
Pro Gly Pro
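
The pairing of codon lines and residue lines in this listing can be spot-checked by translating a stretch of the coding sequence with the standard genetic code. The sketch below is illustrative only: the names `CODON_TABLE` and `translate` are not part of the application, and it assumes the standard nuclear code and an input that starts in frame.

```python
from itertools import product

# Standard genetic code, built in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter residues.

    Whitespace is ignored; a trailing partial codon is dropped;
    stop codons are rendered as '*'.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First eight codons of the CDS in SEQ ID NO: 9 (positions 389 to 412)
print(translate("ATG AGC GGC AGC GTT GGG GAG ATG"))  # MSGSVGEM
```

The result corresponds to Met Ser Gly Ser Val Gly Glu Met, the first eight residues shown beneath that codon line.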



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Ser Gly Ser Val Gly Glu Met Ala Gln Thr Ser Ser Ser Ser Ser
1 5 10 15

Ser Thr Phe Glu His Leu Trp Ser Ser Leu Glu Pro Asp Ser Thr Tyr
20 25 30

Phe Asp Leu Pro Gln Pro Ser Gln Gly Thr Ser Glu Ala Ser Gly Ser
35 40 45

Glu Glu Ser Asn Met Asp Val Phe His Leu Gln Gly Met Ala Gln Phe
50 55 60

Asn Leu Leu Ser Ser Ala Met Asp Gln Met Gly Ser Arg Ala Ala Pro
65 70 75 80

Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Ala Pro Thr His Ser
85 90 95

Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro
100 105 110

Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro
115 120

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 559 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
CGACCTTCCC CAGTCAAGCC GGGGGAATAA TGAGGTGGTG GGCGGAACGG ATTCCAGCAT 60
GGACGTCTTC CACCTGGAGG GCATGACTAC ATCTGTCATG CATCCTCGGC TCCTGCCTCA 120
CTAGCTGCGG AGCCTCTCCC GCTCGGTCCA CGCTGCCGGG CGGCCACGAC CGTGACCCTT 180
CCCCTCGGGC CGCCCAGATC CATGCCTCGT CCCACGGGAC ACCAGTTCCC TGGCGTGTGC 240
AGACCCCCCG GCGCCTACCA TGCTGTACGT CGGTGACCCC GCACGGCACC TCGCCACGGC 300
CCAGTTCAAT CTGCTGAGCA GCACCATGGA CCAGATGAGC AGCCGCGCGG CCTCGGCCAG 360
CCCCTACACC CCAGAGCACG CCGCCAGCGT GCCCACCCAC TCGCCCTACG CACAACCCAG 420
CTCCACCTTC GACACCATGT CGCCGGCGCC TGTCATCCCC TCCAACACCG ACTACCCCGG 480
ACCCCACCAC TTTGAGGTCA CTTTCCAGCA GTCCAGCACG GCCAAGTCAG CCACCTGGAC 540
GTACTCCCCG CTCTTGAAG 559

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1764 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ATGCTGTACG TCGGTGACCC CGCACGGCAC CTCGCCACGG CCCAGTTCAA TCTGCTGAGC 60
AGCACCATGG ACCAGATGAG CAGCCGCGCG GCCTCGGCCA GCCCCTACAC CCCAGAGCAC 120
GCCGCCAGCG TGCCCACCCA CTCGCCCTAC GCACAACCCA GCTCCACCTT CGACACCATG 180
TCGCCGGCGC CTGTCATCCC CTCCAACACC GACTACCCCG GACCCCACCA CTTTGAGGTC 240
ACTTTCCAGC AGTCCAGCAC GGCCAAGTCA GCCACCTGGA CGTACTCCCC GCTCTTGAAG 300
AAACTCTACT GCCAGATCGC CAAGACATGC CCCATCCAGA TCAAGGTGTC CACCCCGCCA 360
CCCCCAGGCA CTGCCATCCG GGCCATGCCT GTTTACAAGA AAGCGGAGCA CGTGACCGAC 420
GTCGTGAAAC GCTGCCCCAA CCACGAGCTC GGGAGGGACT TCAACGAAGG ACAGTCTGCT 480
CCAGCCAGCC ACCTCATCCG CGTGGAAGGC AATAATCTCT CGCAGTATGT GGATGACCCT 540
GTCACCGGCA GGCAGAGCGT CGTGGTGCCC TATGAGCCAC CACAGGTGGG GACGGAATTC 600
ACCACCATCC TGTACAACTT CATGTGTAAC AGCAGCTGTG TAGGGGGCAT GAACCGGCGG 660
CCCATCCTCA TCATCATCAC CCTGGAGATG CGGGATGGGC AGGTGCTGGG CCGCCGGTCC 720
TTTGAGGGCC GCATCTGCGC CTGTCCTGGC CGCGACCGAA AAGCTGATGA GGACCACTAC 780
CGGGAGCAGC AGGCCCTGAA CGAGAGCTCC GCCAAGAACG GGGCCGCCAG CAAGCGTGCC 840
TTCAAGCAGA GCCCCCCTGC CGTCCCCGCC CTTGGTGCCG GTGTGAAGAA GCGGCGGCAT 900
GGAGACGAGG ACACGTACTA CCTTCAGGTG CGAGGCCGGG AGAACTTTGA GATCCTGATG 960
AAGCTGAAAG AGAGCCTGGA GCTGATGGAG TTGGTGCCGC AGCCACTGGT GGACTCCTAT 1020
CGGCAGCAGC AGCAGCTCCT ACAGAGGCCG AGTCACCTAC AGCCCCCGTC CTACGGGCCG 1080
GTCCTCTCGC CCATGAACAA GGTGCACGGG GGCATGAACA AGCTGCCCTC CGTCAACCAG 1140
CTGGTGGGCC AGCCTCCCCC GCACAGTTCG GCAGCTACAC CCAACCTGGG GCCCGTGGGC 1200
CCCGGGATGC TCAACAACCA TGGCCACGCA GTGCCAGCCA ACGGCGAGAT GAGCAGCAGC 1260
CACAGCGCCC AGTCCATGGT CTCGGGGTCC CACTGCACTC CGCCACCCCC CTACCACGCC 1320
GACCCCAGCC TCGTCAGTTT TTTAACAGGA TTGGGGTGTC CAAACTGCAT CGAGTATTTC 1380
ACCTCCCAAG GGTTACAGAG CATTTACCAC CTGCAGAACC TGACCATTGA GGACCTGGGG 1440
GCCCTGAAGA TCCCCGAGCA GTACCGCATG ACCATCTGGC GGGGCCTGCA GGACCTGAAG 1500
CAGGGCCACG ACTACAGCAC CGCGCAGCAG CTGCTCCGCT CTAGCAACGC GGCCACCATC 1560
TCCATCGGCG GCTCAGGGGA ACTGCAGCGC CAGCGGGTCA TGGAGGCCGT GCACTTCCGC 1620
GTGCGCCACA CCATCACCAT CCCCAACCGC GGCGGCCCAG GCGGCGGCCC TGACGAGTGG 1680
GCGGACTTCG GCTTCGACCT GCCCGACTGC AAGGCCCGCA AGCAGCCCAT CAAGGAGGAG 1740
TTCACGGAGG CCGAGATCCA CTGA 1764
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 587 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Leu Tyr Val Gly Asp Pro Ala Arg His Leu Ala Thr Ala Gln Phe
1 5 10 15

Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala Ser
20 25 30

Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His Ser
35 40 45

Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro
50 55 60

Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu Val
65 70 75 80

Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser
85 90 95

Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile
100 105 110

Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala
115 120 125

Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys Arg
130 135 140

Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala
145 150 155 160

Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr
165 170 175

Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu
180 185 190

Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met
195 200 205

Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu Ile
210 215 220

Ile Ile Thr Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg Ser
225 230 235 240

Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp
245 250 255

Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys
260 265 270

Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Val
275 280 285

Pro Ala Leu Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu Asp
290 295 300

Thr Tyr Tyr Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met
305 310 315 320

Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu
325 330 335

Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser His
340 345 350

Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys Val
355 360 365

His Gly Gly Met Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly Gln
370 375 380

Pro Pro Pro His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val Gly
385 390 395 400

Pro Gly Met Leu Asn Asn His Gly His Ala Val Pro Ala Asn Gly Glu
405 410 415

Met Ser Ser Ser His Ser Ala Gln Ser Met Val Ser Gly Ser His Cys
420 425 430

Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser Phe Leu
435 440 445

Thr Gly Leu Gly Cys Pro Asn Cys Ile Glu Tyr Phe Thr Ser Gln Gly
450 455 460

Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu Gly
465 470 475 480

Ala Leu Lys Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly Leu
485 490 495

Gln Asp Leu Lys Gln Gly His Asp Tyr Ser Thr Ala Gln Gln Leu Leu
500 505 510

Arg Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu Leu
515 520 525

Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His Thr
530 535 540

Ile Thr Ile Pro Asn Arg Gly Gly Pro Gly Gly Gly Pro Asp Glu Trp
545 550 555 560

Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln Pro
565 570 575

Ile Lys Glu Glu Phe Thr Glu Ala Glu Ile His
580 585

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1521 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
ATGCTGTACG TCGGTGACCC CGCACGGCAC CTCGCCACGG CCCAGTTCAA TCTGCTGAGC 60
AGCACCATGG ACCAGATGAG CAGCCGCGCG GCCTCGGCCA GCCCCTACAC CCCAGAGCAC 120
GCCGCCAGCG TGCCCACCCA CTCGCCCTAC GCACAACCCA GCTCCACCTT CGACACCATG 180
TCGCCGGCGC CTGTCATCCC CTCCAACACC GACTACCCCG GACCCCACCA CTTTGAGGTC 240
ACTTTCCAGC AGTCCAGCAC GGCCAAGTCA GCGACCTGGA CGTACTCCCC GCTCTTGAAG 300
AAACTCTACT GCCAGATCGC CAAGACATGC CCCATCCAGA TCAAGGTGTC CACCCCGCCA 360
CCCCCAGGCA CTGCCATCCG GGCCATGCCT GTTTACAAGA AAGCGGAGCA CGTGACCGAC 420
GTCGTGAAAC GCTGCCCCAA CCACGAGCTC GGGAGGGACT TCAACGAAGG ACAGTCTGCT 480
CCAGCCAGCC ACCTCATCCG CGTGGAAGGC AATAATCTCT CGCAGTATGT GGATGACCCT 540
GTCACCGGCA GGCAGAGCGT CGTGGTGCCC TATGAGCCAC CACAGGTGGG GACGGAATTC 600
ACCACCATCC TGTACAACTT CATGTGTAAC AGCAGCTGTG TAGGGGGCAT GAACCGGCGG 660
CCCATCCTCA TCATCATCAC CCTGGAGATG CGGGATGGGC AGGTGCTGGG CCGCCGGTCC 720
TTTGAGGGCC GCATCTGCGC CTGTCCTGGC CGCGACCGAA AAGCTGATGA GGACCACTAC 780
CGGGAGCAGC AGGCCCTGAA CGAGAGCTCC GCCAAGAACG GGGCCGCCAG CAAGCGTGCC 840
TTCAAGCAGA GCCCCCCTGC CGTCCCCGCC CTTGGTGCCG GTGTGAAGAA GCGGCGGCAT 900
GGAGACGAGG ACACGTACTA CCTTCAGGTG CGAGGCCGGG AGAACTTTGA GATCCTGATG 960
AAGCTGAAAG AGAGCCTGGA GCTGATGGAG TTGGTGCCGC AGCCACTGGT GGACTCCTAT 1020
CGGCAGCAGC AGCAGCTCCT ACAGAGGCCG CCCCGGGATG CTCAACAACC ATGGCCACGC 1080
AGTGCCAGCC AACGGCGAGA TGAGCAGCAG CCACAGCGCC CAGTCCATGG TCTCGGGGTC 1140
CCACTGCACT CCGCCACCCC CCTACCACGC CGACCCCAGC CTCGTCAGGA CCTGGGGGCC 1200
CTGAAGATCC CCGAGCAGTA CCGCATGACC ATCTGGCGGG GCCTGCAGGA CCTGAAGCAG 1260
GGCCACGACT ACAGCACCGC GCAGCAGCTG CTCCGCTCTA GCAACGCGGC CACCATCTCC 1320
ATCGGCGGCT CAGGGGAACT GCAGCGCCAG CGGGTCATGG AGGCCGTGCA CTTCCGCGTG 1380
CGCCACACCA TCACCATCCC CAACCGCGGC GGCCCAGGCG GCGGCCCTGA CGAGTGGGCG 1440
GACTTCGGCT TCGACCTGCC CGACTGCAAG GCCCGCAAGC AGCCCATCAA GGAGGAGTTC 1500
ACGGAGGCCG AGATCCACTG A 1521
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 506 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Leu Tyr Val Gly Asp Pro Ala Arg His Leu Ala Thr Ala Gln Phe
1 5 10 15

Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala Ser
20 25 30

Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His Ser
35 40 45

Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro
50 55 60

Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu Val
65 70 75 80

Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser
85 90 95

Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile
100 105 110

Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala
115 120 125

Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys Arg
130 135 140

Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala
145 150 155 160

Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr
165 170 175

Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu
180 185 190

Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met
195 200 205

Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu Ile
210 215 220

Ile Ile Thr Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg Ser
225 230 235 240

Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp
245 250 255

Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys
260 265 270

Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Val
275 280 285

Pro Ala Leu Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu Asp
290 295 300

Thr Tyr Tyr Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met
305 310 315 320

Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu
325 330 335

Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Pro Arg
340 345 350

Asp Ala Gln Gln Pro Trp Pro Arg Ser Ala Ser Gln Arg Arg Asp Glu
355 360 365

Gln Gln Pro Gln Arg Pro Val His Gly Leu Gly Val Pro Leu His Ser
370 375 380

Ala Thr Pro Leu Pro Arg Arg Pro Gln Pro Arg Gln Asp Leu Gly Ala
385 390 395 400

Leu Lys Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly Leu Gln
405 410 415

Asp Leu Lys Gln Gly His Asp Tyr Ser Thr Ala Gln Gln Leu Leu Arg
420 425 430

Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu Leu Gln
435 440 445

Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His Thr Ile
450 455 460

Thr Ile Pro Asn Arg Gly Gly Pro Gly Gly Gly Pro Asp Glu Trp Ala
465 470 475 480

Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln Pro Ile
485 490 495

Lys Glu Glu Phe Thr Glu Ala Glu Ile His
500 505

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1870 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 104..1867 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TSCCCGGGGC TGCGACGGCT GCAGGGAACC AGACAGCACC TACTTCGACC TTCCCCAGTC 

AAGCCGGGGG AATAATGAGG TGGTGGGCGG AACGGATTCC AGC ATG GAC GTC TTC 

Met Asp Val Phe 
1 

CAC CTG GAG GGC ATG ACT ACA TCT GTC ATG GCC CAG TTC AAT CTG CTG 
His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln Phe Asn Leu Leu 
5 10 15 20 

AGC AGC ACC ATG GAC CAG ATG AGC AGC CGC GCG GCC TCG GCC AGC CCC 
Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala Ser Ala Ser Pro 
25 30 35 

TAC ACC CCA GAG CAC GCC GCC AGC GTG CCC ACC CAC TCG CCC TAC GCA 
Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His Ser Pro Tyr Ala 
40 45 50 

CAA CCC AGC TCC ACC TTC GAC ACC ATG TCG CCG GCG CCT GTC ATC CCC 
Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala Pro Val Ile Pro 
55 60 65 

TCC AAC ACC GAC TAC CCC GGA CCC CAC CAC TTT GAG GTC ACT TTC CAG 
Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu Val Thr Phe Gln 
70 75 80 

CAG TCC AGC ACG GCC AAG TCA GCC ACC TGG ACG TAC TCC CCG CTC TTG 
Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr Ser Pro Leu Leu 
85 90 95 100 

AAG AAA CTC TAC TGC CAG ATC GCC AAG ACA TGC CCC ATC CAG ATC AAG 
Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro Ile Gln Ile Lys 
105 110 115 

GTG TCC ACC CCG CCA CCC CCA GGC ACT GCC ATC CGG GCC ATG CCT GTT 
Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg Ala Met Pro Val 
120 125 130 

TAC AAG AAA GCG GAG CAC GTG ACC GAC GTC GTG AAA CGC TGC CCC AAC 
Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys Arg Cys Pro Asn 
135 140 145 

CAC GAG CTC GGG AGG GAC TTC AAC GAA GGA CAG TCT GCT CCA GCC AGC 
His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser Ala Pro Ala Ser 
150 155 160 

CAC CTC ATC CGC GTG GAA GGC AAT AAT CTC TCG CAG TAT GTG GAT GAC 
His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln Tyr Val Asp Asp 
165 170 175 180 

CCT GTC ACC GGC AGG CAG AGC GTC GTG GTG CCC TAT GAG CCA CCA CAG 
Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr Glu Pro Pro Gln 
185 190 195 

GTG GGG ACG GAA TTC ACC ACC ATC CTG TAC AAC TTC ATG TGT AAC AGC 
Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe Met Cys Asn Ser 
200 205 210 

AGC TGT GTA GGG GGC ATG AAC CGG CGG CCC ATC CTC ATC ATC ATC ACC 
Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu Ile Ile Ile Thr 
215 220 225 

CTG GAG ATG CGG GAT GGG CAG GTG CTG GGC CGC CGG TCC TTT GAG GGC 
Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg Ser Phe Glu Gly 
230 235 240 



CGC ATC TGC GCC TGT CCT GGC CGC GAC CGA AAA GCT GAT GAG GAC CAC 
Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala Asp Glu Asp His 
245 250 255 260 

TAC CGG GAG CAG CAG GCC CTG AAC GAG AGC TCC GCC AAG AAC GGG GCC 
Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala Lys Asn Gly Ala 
265 270 275 

GCC AGC AAG CGT GCC TTC AAG CAG AGC CCC CCT GCC GTC CCC GCC CTT 
Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala Val Pro Ala Leu 
280 285 290 

GGT GCC GGT GTG AAG AAG CGG CGG CAT GGA GAC GAG GAC ACG TAC TAC 
Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu Asp Thr Tyr Tyr 
295 300 305 

CTT CAG GTG CGA GGC CGG GAG AAC TTT GAG ATC CTG ATG AAG CTG AAA 
Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu Met Lys Leu Lys 
310 315 320 

GAG AGC CTG GAG CTG ATG GAG TTG GTG CCG CAG CCA CTG GTG GAC TCC 
Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro Leu Val Asp Ser 
325 330 335 340 

TAT CGG CAG CAG CAG CAG CTC CTA CAG AGG CCG AGT CAC CTA CAG CCC 
Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser His Leu Gln Pro 
345 350 355 

CCG TCC TAC GGG CCG GTC CTC TCG CCC ATG AAC AAG GTG CAC GGG GGC 
Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys Val His Gly Gly 
360 365 370 

ATG AAC AAG CTG CCC TCC GTC AAC CAG CTG GTG GGC CAG CCT CCC CCG 
Met Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly Gln Pro Pro Pro 
375 380 385 

CAC AGT TCG GCA GCT ACA CCC AAC CTG GGG CCC GTG GGC CCC GGG ATG 
His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val Gly Pro Gly Met 
390 395 400 

CTC AAC AAC CAT GGC CAC GCA GTG CCA GCC AAC GGC GAG ATG AGC AGC 
Leu Asn Asn His Gly His Ala Val Pro Ala Asn Gly Glu Met Ser Ser 
405 410 415 420 

AGC CAC AGC GCC CAG TCC ATG GTC TCG GGG TCC CAC TGC ACT CCG CCA 
Ser His Ser Ala Gln Ser Met Val Ser Gly Ser His Cys Thr Pro Pro 
425 430 435 

CCC CCC TAC CAC GCC GAC CCC AGC CTC GTC AGT TTT TTA ACA GGA TTG 
Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser Phe Leu Thr Gly Leu 
440 445 450 

GGG TGT CCA AAC TGC ATC GAG TAT TTC ACC TCC CAA GGG TTA CAG AGC 
Gly Cys Pro Asn Cys Ile Glu Tyr Phe Thr Ser Gln Gly Leu Gln Ser 
455 460 465 

ATT TAC CAC CTG CAG AAC CTG ACC ATT GAG GAC CTG GGG GCC CTG AAG 
Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu Gly Ala Leu Lys 
470 475 480 

ATC CCC GAG CAG TAC CGC ATG ACC ATC TGG CGG GGC CTG CAG GAC CTG 
Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly Leu Gln Asp Leu 
485 490 495 500 

AAG CAG GGC CAC GAC TAC AGC ACC GCG CAG CAG CTG CTC CGC TCT AGC 
Lys Gln Gly His Asp Tyr Ser Thr Ala Gln Gln Leu Leu Arg Ser Ser 
505 510 515 

AAC GCG GCC ACC ATC TCC ATC GGC GGC TCA GGG GAA CTG CAG CGC CAG 
Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu Leu Gln Arg Gln 
520 525 530 

CGG GTC ATG GAG GCC GTG CAC TTC CGC GTG CGC CAC ACC ATC ACC ATC 
Arg Val Met Glu Ala Val His Phe Arg Val Arg His Thr Ile Thr Ile 
535 540 545 

CCC AAC CGC GGC GGC CCA GGC GGC GGC CCT GAC GAG TGG GCG GAC TTC 
Pro Asn Arg Gly Gly Pro Gly Gly Gly Pro Asp Glu Trp Ala Asp Phe 
550 555 560 

GGC TTC GAC CTG CCC GAC TGC AAG GCC CGC AAG CAG CCC ATC AAG GAG 
Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln Pro Ile Lys Glu 
565 570 575 580 

GAG TTC ACG GAG GCC GAG ATC CAC TGA 
Glu Phe Thr Glu Ala Glu Ile His 
585 
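
As an arithmetic cross-check on the listings above, the span of a coding sequence follows from its start position and the number of residues it encodes. The Python sketch below is illustrative only and is not part of the application as filed. The helper name `cds_bounds` is hypothetical, coordinates are assumed to be 1-based and inclusive, a single three-nucleotide stop codon is assumed to follow the last sense codon, and the coding sequence of SEQ ID NO: 14 is assumed to begin at position 1.

```python
def cds_bounds(start, n_residues, stop_len=3):
    """Return (last sense nucleotide, last stop-codon nucleotide).

    Coordinates are 1-based and inclusive; `start` is the first
    nucleotide of the initiation codon (assumption of this sketch).
    """
    last_sense = start + 3 * n_residues - 1
    return last_sense, last_sense + stop_len


# SEQ ID NO: 16 / 17: initiation ATG at position 104, 588 residues.
seq16 = cds_bounds(104, 588)
# SEQ ID NO: 14 / 15: assumed to start at position 1, 506 residues.
seq14 = cds_bounds(1, 506)

print(seq16)  # (1867, 1870)
print(seq14)  # (1518, 1521)
```

Under these assumptions, the stop codon of SEQ ID NO: 16 ends at nucleotide 1870, which agrees with the stated length of 1870 base pairs, and that of SEQ ID NO: 14 ends at 1521, which agrees with the final position printed for that sequence.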

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 588 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Asp Val Phe His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln 
1 5 10 15 

Phe Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala 
20 25 30 

Ser Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His 
35 40 45 

Ser Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala 
50 55 60 

Pro Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu 
65 70 75 80 

Val Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr 
85 90 95 

Ser Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro 
100 105 110 

Ile Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg 
115 120 125 

Ala Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys 
130 135 140 

Arg Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser 
145 150 155 160 

Ala Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln 
165 170 175 

Tyr Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr 
180 185 190 

Glu Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe 
195 200 205 

Met Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu 
210 215 220 

Ile Ile Ile Thr Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg 
225 230 235 240 

Ser Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala 
245 250 255 

Asp Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala 
260 265 270 

Lys Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala 
275 280 285 

Val Pro Ala Leu Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu 
290 295 300 

Asp Thr Tyr Tyr Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu 
305 310 315 320 

Met Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro 
325 330 335 

Leu Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser 
340 345 350 

His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys 
355 360 365 

Val His Gly Gly Met Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly 
370 375 380 

Gln Pro Pro Pro His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val 
385 390 395 400 

Gly Pro Gly Met Leu Asn Asn His Gly His Ala Val Pro Ala Asn Gly 
405 410 415 

Glu Met Ser Ser Ser His Ser Ala Gln Ser Met Val Ser Gly Ser His 
420 425 430 

Cys Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Ser Phe 
435 440 445 

Leu Thr Gly Leu Gly Cys Pro Asn Cys Ile Glu Tyr Phe Thr Ser Gln 
450 455 460 

Gly Leu Gln Ser Ile Tyr His Leu Gln Asn Leu Thr Ile Glu Asp Leu 
465 470 475 480 

Gly Ala Leu Lys Ile Pro Glu Gln Tyr Arg Met Thr Ile Trp Arg Gly 
485 490 495 

Leu Gln Asp Leu Lys Gln Gly His Asp Tyr Ser Thr Ala Gln Gln Leu 
500 505 510 

Leu Arg Ser Ser Asn Ala Ala Thr Ile Ser Ile Gly Gly Ser Gly Glu 
515 520 525 

Leu Gln Arg Gln Arg Val Met Glu Ala Val His Phe Arg Val Arg His 
530 535 540 

Thr Ile Thr Ile Pro Asn Arg Gly Gly Pro Gly Gly Gly Pro Asp Glu 
545 550 555 560 

Trp Ala Asp Phe Gly Phe Asp Leu Pro Asp Cys Lys Ala Arg Lys Gln 
565 570 575 

Pro Ile Lys Glu Glu Phe Thr Glu Ala Glu Ile His 
580 585 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1817 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGCCCAGT CCACCGCCAC CTCCCCTGAT GGGGGCACCA CGTTTGAGCA CCTCTGGAGC 60 

TCTCTGGAAC CAGACAGCAC CTACTTCGAC CTTCCCCAGT CAAGCCGGGG GAATAATGAG 120 

GTGGTGGGCG GAACGGATTC CAGCATGGAC GTCTTCCACC TGGAGGGCAT GACTACATCT 180 

GTCATGGCCC AGTTCAATCT GCTGAGCAGC ACCATGGACC AGATGAGCAG CCGCGCGGCC 240 

TCGGCCAGCC CCTACACCCC AGAGCACGCC GCCAGCGTGC CCACCCACTC GCCCTACGCA 300 

CAACCCAGCT CCACCTTCGA CACCATGTCG CCGGCGCCTG TCATCCCCTC CAACACCGAC 360 

TACCCCGGAC CCCACCACTT TGAGGTCACT TTCCAGCAGT CCAGCACGGC CAAGTCAGCC 420 

ACCTGGACGT ACTCCCCGCT CTTGAAGAAA CTCTACTGCC AGATCGCCAA GACATGCCCC 480 

ATCCAGATCA AGGTGTCCAC CCCGCCACCC CCAGGCACTG CCATCCGGGC CATGCCTGTT 540 

TACAAGAAAG CGGAGCACGT GACCGACGTC GTGAAACGCT GCCCCAACCA CGAGCTCGGG 600 

AGGGACTTCA ACGAAGGACA GTCTGCTCCA GCCAGCCACC TCATCCGCGT GGAAGGCAAT 660 

AATCTCTCGC AGTATGTGGA TGACCCTGTC ACCGGCAGGC AGAGCGTCGT GGTGCCCTAT 720 

GAGCCACCAC AGGTGGGGAC GGAATTCACC ACCATCCTGT ACAACTTCAT GTGTAACAGC 780 

AGCTGTGTAG GGGGCATGAA CCGGCGGCCC ATCCTCATCA TCATCACCCT GGAGATGCGG 840 

GATGGGCAGG TGCTGGGCCG CCGGTCCTTT GAGGGCCGCA TCTGCGCCTG TCCTGGCCGC 900 

GACCGAAAAG CTGATGAGGA CCACTACCGG GAGCAGCAGG CCCTGAACGA GAGCTCCGCC 960 

AAGAACGGGG CCGCCAGCAA GCGTGCCTTC AAGCAGAGCC CCCCTGCCGT CCCCGCCCTT 1020 

GGTGCCGGTG TGAAGAAGCG GCGGCATGGA GACGAGGACA CGTACTACCT TCAGGTGCGA 1080 

GGCCGGGAGA ACTTTGAGAT CCTGATGAAG CTGAAAGAGA GCCTGGAGCT GATGGAGTTG 1140 

GTGCCGCAGC CACTGGTGGA CTCCTATCGG CAGCAGCAGC AGCTCCTACA GAGGCCGAGT 1200 

CACCTACAGC CCCCGTCCTA CGGGCCGGTC CTCTCGCCCA TGAACAAGGT GCACGGGGGC 1260 

ATGAACAAGC TGCCCTCCGT CAACCAGCTG GTGGGCCAGC CTCCCCCGCA CAGTTCGGCA 1320 

GCTACACCCA ACCTGGGGCC CGTGGGCCCC GGGATGCTCA ACAACCATGG CCACGCAGTG 1380 

CCAGCCAACG GCGAGATGAG CAGCAGCCAC AGCGCCCAGT CCATGGTCTC GGGGTCCCAC 1440 

TGCACTCCGC CACCCCCCTA CCACGCCGAC CCCAGCCTCG TCAGGACCTG GGGGCCCTGA 1500 

AGATCCCCGA GCAGTACCGC ATGACCATCT GGCGGGGCCT GCAGGACCTG AAGCAGGGCC 1560 

ACGACTACAG CACCGCGCAG CAGCTGCTCC GCTCTAGCAA CGCGGCCACC ATCTCCATCG 1620 

GCGGCTCAGG GGAACTGCAG CGCCAGCGGG TCATGGAGGC CGTGCACTTC CGCGTGCGCC 1680 

ACACCATCAC CATCCCCAAC CGCGGCGGCC CAGGCGGCGG CCCTGACGAG TGGGCGGACT 1740 

TCGGCTTCGA CCTGCCCGAC TGCAAGGCCC GCAAGCAGCC CATCAAGGAG GAGTTCACGG 1800 

AGGCCGAGAT CCACTGA 1817 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 499 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Ala Gln Ser Thr Ala Thr Ser Pro Asp Gly Gly Thr Thr Phe Glu 
1 5 10 15 

His Leu Trp Ser Ser Leu Glu Pro Asp Ser Thr Tyr Phe Asp Leu Pro 
20 25 30 

Gln Ser Ser Arg Gly Asn Asn Glu Val Val Gly Gly Thr Asp Ser Ser 
35 40 45 

Met Asp Val Phe His Leu Glu Gly Met Thr Thr Ser Val Met Ala Gln 
50 55 60 

Phe Asn Leu Leu Ser Ser Thr Met Asp Gln Met Ser Ser Arg Ala Ala 
65 70 75 80 

Ser Ala Ser Pro Tyr Thr Pro Glu His Ala Ala Ser Val Pro Thr His 
85 90 95 

Ser Pro Tyr Ala Gln Pro Ser Ser Thr Phe Asp Thr Met Ser Pro Ala 
100 105 110 

Pro Val Ile Pro Ser Asn Thr Asp Tyr Pro Gly Pro His His Phe Glu 
115 120 125 

Val Thr Phe Gln Gln Ser Ser Thr Ala Lys Ser Ala Thr Trp Thr Tyr 
130 135 140 



Ser Pro Leu Leu Lys Lys Leu Tyr Cys Gln Ile Ala Lys Thr Cys Pro 
145 150 155 160 

Ile Gln Ile Lys Val Ser Thr Pro Pro Pro Pro Gly Thr Ala Ile Arg 
165 170 175 

Ala Met Pro Val Tyr Lys Lys Ala Glu His Val Thr Asp Val Val Lys 
180 185 190 

Arg Cys Pro Asn His Glu Leu Gly Arg Asp Phe Asn Glu Gly Gln Ser 
195 200 205 

Ala Pro Ala Ser His Leu Ile Arg Val Glu Gly Asn Asn Leu Ser Gln 
210 215 220 

Tyr Val Asp Asp Pro Val Thr Gly Arg Gln Ser Val Val Val Pro Tyr 
225 230 235 240 

Glu Pro Pro Gln Val Gly Thr Glu Phe Thr Thr Ile Leu Tyr Asn Phe 
245 250 255 

Met Cys Asn Ser Ser Cys Val Gly Gly Met Asn Arg Arg Pro Ile Leu 
260 265 270 

Ile Ile Ile Thr Leu Glu Met Arg Asp Gly Gln Val Leu Gly Arg Arg 
275 280 285 

Ser Phe Glu Gly Arg Ile Cys Ala Cys Pro Gly Arg Asp Arg Lys Ala 
290 295 300 

Asp Glu Asp His Tyr Arg Glu Gln Gln Ala Leu Asn Glu Ser Ser Ala 
305 310 315 320 

Lys Asn Gly Ala Ala Ser Lys Arg Ala Phe Lys Gln Ser Pro Pro Ala 
325 330 335 

Val Pro Ala Leu Gly Ala Gly Val Lys Lys Arg Arg His Gly Asp Glu 
340 345 350 

Asp Thr Tyr Tyr Leu Gln Val Arg Gly Arg Glu Asn Phe Glu Ile Leu 
355 360 365 

Met Lys Leu Lys Glu Ser Leu Glu Leu Met Glu Leu Val Pro Gln Pro 
370 375 380 

Leu Val Asp Ser Tyr Arg Gln Gln Gln Gln Leu Leu Gln Arg Pro Ser 
385 390 395 400 

His Leu Gln Pro Pro Ser Tyr Gly Pro Val Leu Ser Pro Met Asn Lys 
405 410 415 

Val His Gly Gly Met Asn Lys Leu Pro Ser Val Asn Gln Leu Val Gly 
420 425 430 

Gln Pro Pro Pro His Ser Ser Ala Ala Thr Pro Asn Leu Gly Pro Val 
435 440 445 

Gly Pro Gly Met Leu Asn Asn His Gly His Ala Val Pro Ala Asn Gly 
450 455 460 

Glu Met Ser Ser Ser His Ser Ala Gln Ser Met Val Ser Gly Ser His 
465 470 475 480 

Cys Thr Pro Pro Pro Pro Tyr His Ala Asp Pro Ser Leu Val Arg Thr 
485 490 495 

Trp Gly Pro 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GCGAGCTGCC CTCGGAG 17 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GGTTCTGCAG GTGACTCAG 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GCCATGCCTG TCTACAAG 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
ACCAGCTGGT TGACGGAG 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GTCAACCAGC TGGTGGGCCA G 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GTGGATCTCG GCCTCC 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
AGGCCGGCGT GGGGAAG 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
CTTGGCGATC TGGCAGTAG 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GCGGCCACGA CCGTGAC 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GGCAGCTTGG GTCTCTGG 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CTGTACGTCG GTGACCCC 18 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TCAGTGGATC TCGGCCTC 18 
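
An oligonucleotide flagged ANTI-SENSE: YES should be the reverse complement of a stretch of the coding strand. The following Python sketch, which is illustrative and not part of the application as filed, checks this for SEQ ID NO: 25 and SEQ ID NO: 31 against the last 21 nucleotides printed for SEQ ID NO: 14 (positions 1501 to 1521). The function name `reverse_complement` and the variable names are hypothetical; only the sequences are taken from the listing.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Positions 1501-1521 of SEQ ID NO: 14, as printed in the listing.
SEQ_ID_14_TAIL = "ACGGAGGCCGAGATCCACTGA"
SEQ_ID_25 = "GTGGATCTCGGCCTCC"    # antisense, 16 bases
SEQ_ID_31 = "TCAGTGGATCTCGGCCTC"  # antisense, 18 bases

target_25 = reverse_complement(SEQ_ID_25)
target_31 = reverse_complement(SEQ_ID_31)

print(target_25)                        # GGAGGCCGAGATCCAC
print(target_31)                        # GAGGCCGAGATCCACTGA
print(SEQ_ID_14_TAIL.find(target_25))   # 2
print(SEQ_ID_14_TAIL.endswith(target_31))  # True
```

On this reading, SEQ ID NO: 31 is complementary to the last 18 nucleotides of SEQ ID NO: 14, including the TGA stop codon, while SEQ ID NO: 25 is complementary to the 16 nucleotides immediately upstream of that stop codon.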
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
AGGGGACGCA GCGAAACC 18 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



CCATCAGCTC CAGGCTCTC 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
CCAGGACAGG CGCAGATG 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
GATGAGGTGG CTGGCTGGA 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
TGGTCAGGTT CTGCAGGTG 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
CACCTACTCC AGGGATGC 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
AGGAAAATAG AAGCGTCAGT C 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 18 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CAGGCCCACT TGCCTGCC 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CTGTCCCCAA GCTGATGAG 
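
Each oligonucleotide entry above declares a LENGTH that can be compared with the number of bases actually printed. The Python sketch below is illustrative and not part of the application as filed; the names `OLIGOS` and `length_mismatches` are hypothetical. The declared lengths and sequences are transcribed from SEQ ID NO: 20 to SEQ ID NO: 40 with spaces removed.

```python
# seq_id: (declared LENGTH in the header, printed sequence)
OLIGOS = {
    20: (17, "GCGAGCTGCCCTCGGAG"),
    21: (19, "GGTTCTGCAGGTGACTCAG"),
    22: (19, "GCCATGCCTGTCTACAAG"),
    23: (18, "ACCAGCTGGTTGACGGAG"),
    24: (21, "GTCAACCAGCTGGTGGGCCAG"),
    25: (16, "GTGGATCTCGGCCTCC"),
    26: (17, "AGGCCGGCGTGGGGAAG"),
    27: (19, "CTTGGCGATCTGGCAGTAG"),
    28: (17, "GCGGCCACGACCGTGAC"),
    29: (18, "GGCAGCTTGGGTCTCTGG"),
    30: (18, "CTGTACGTCGGTGACCCC"),
    31: (18, "TCAGTGGATCTCGGCCTC"),
    32: (18, "AGGGGACGCAGCGAAACC"),
    33: (19, "CCATCAGCTCCAGGCTCTC"),
    34: (18, "CCAGGACAGGCGCAGATG"),
    35: (19, "GATGAGGTGGCTGGCTGGA"),
    36: (19, "TGGTCAGGTTCTGCAGGTG"),
    37: (18, "CACCTACTCCAGGGATGC"),
    38: (21, "AGGAAAATAGAAGCGTCAGTC"),
    39: (18, "CAGGCCCACTTGCCTGCC"),
    40: (19, "CTGTCCCCAAGCTGATGAG"),
}


def length_mismatches(oligos):
    """Return {seq_id: (declared, actual)} for entries whose lengths differ."""
    return {
        seq_id: (declared, len(seq))
        for seq_id, (declared, seq) in oligos.items()
        if declared != len(seq)
    }


mismatches = length_mismatches(OLIGOS)
shortest = min(len(seq) for _, seq in OLIGOS.values())

print(mismatches)  # {22: (19, 18)}
print(shortest)    # 16
```

Only SEQ ID NO: 22 is reported: its header states 19 base pairs while 18 bases are printed. Whether the header or the printed sequence carries the error cannot be determined from this text alone. The shortest oligonucleotide, SEQ ID NO: 25, has 16 bases.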



CLAIMS 

1. Purified polypeptide, comprising an amino acid 
sequence selected from the group consisting of: 

a) the sequence SEQ ID No. 2; 
b) the sequence SEQ ID No. 4; 
c) the sequence SEQ ID No. 6; 
d) the sequence SEQ ID No. 8; 
e) the sequence SEQ ID No. 10; 
f) the sequence SEQ ID No. 13; 
g) the sequence SEQ ID No. 15; 
h) the sequence SEQ ID No. 17; 
i) the sequence SEQ ID No. 19; 
and j) any biologically active sequence derived 
from SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID 
No. 8, SEQ ID No. 10, SEQ ID No. 13, SEQ ID No. 15, SEQ 
ID No. 17 or SEQ ID No. 19. 

2. Polypeptide according to Claim 1, characterized 
in that it comprises the amino acid sequence selected 
from the group consisting of SEQ ID No. 6, SEQ ID No. 
13, SEQ ID No. 15, SEQ ID No. 17 and SEQ ID No. 19. 

3. Polypeptide according to Claim 1, characterized 
in that it comprises the sequence lying between: 

- residue 110 and residue 310 of SEQ ID No. 2 or 6; 

- residue 60 and residue 260 of SEQ ID No. 8. 

4. Polypeptide according to Claim 1, characterized 
in that it results from an alternative splicing of the 
messenger RNA of the corresponding gene. 

5. Polypeptide according to any one of the 
preceding claims, characterized in that it is a 
recombinant polypeptide produced in the form of a 
fusion protein. 

6. Isolated nucleic acid sequence coding for a 
polypeptide according to any one of the preceding 
claims. 

7. Isolated nucleic acid sequence according to 
Claim 6, characterized in that it is selected from the 
group consisting of: 

a) the sequence SEQ ID No. 1; 
b) the sequence SEQ ID No. 3; 
c) the sequence SEQ ID No. 5; 
d) the sequence SEQ ID No. 7; 
e) the sequence SEQ ID No. 9; 
f) the sequence SEQ ID No. 11; 
g) the sequence SEQ ID No. 12; 
h) the sequence SEQ ID No. 14; 
i) the sequence SEQ ID No. 16; 
j) the sequence SEQ ID No. 18; 
k) the nucleic acid sequences capable of 
hybridizing specifically with the sequence SEQ ID 
No. 1, SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, 
SEQ ID No. 9, SEQ ID No. 11, SEQ ID No. 12, SEQ ID 
No. 14, SEQ ID No. 16 or SEQ ID No. 18 or with the 
sequences complementary to them, or of hybridizing 
specifically with their proximal sequences; 
and l) the sequences derived from the sequences 
a), b), c), d), e), f), g), h), i), j) or k) as a 
result of the degeneracy of the genetic code, 
mutation, deletion, insertion, and alternative 
splicing or an allelic variability. 

8. Nucleotide sequence according to Claim 6, 
characterized in that it is a sequence selected from 
SEQ ID No. 5, SEQ ID No. 12, SEQ ID No. 14, SEQ ID No. 
16 and SEQ ID No. 18 coding, respectively, for the 
polypeptide of sequences SEQ ID No. 6, SEQ ID No. 13, 
SEQ ID No. 15, SEQ ID No. 17 and SEQ ID No. 19. 

9. Cloning and/or expression vector containing a 
nucleic acid sequence according to any one of Claims 6 
to 8. 

10. Vector according to Claim 9, characterized in 
that it is the plasmid pSEl. 

11. Host cell transfected by a vector according to 
Claim 9 or 10. 

12. Transfected host cell according to Claim 11, 
characterized in that it is E. coli MC 1061. 

13. Nucleotide probe or nucleotide primer, 
characterized in that it hybridizes specifically with 
any one of the sequences according to Claims 6 to 8 or 
the sequences complementary to them or the 
corresponding messenger RNAs or the corresponding 
genes. 

14. Probe or primer according to Claim 13, 
characterized in that it contains at least 16 
nucleotides. 

15. Probe or primer according to Claim 13, 
characterized in that it comprises the whole of the 
sequence of the gene coding for one of the polypeptides 
of Claim 1. 

16. Nucleotide probe or primer selected from the 
group consisting of the following oligonucleotides or 
sequences complementary to them: 



SEQ ID No. 20: GCG AGC TGC CCT CGG AG 
SEQ ID No. 21: GGT TCT GCA GGT GAC TCA G 
SEQ ID No. 22: GCC ATG CCT GTC TAC AAG 
SEQ ID No. 23: ACC AGC TGG TTG ACG GAG 
SEQ ID No. 24: GTC AAC CAG CTG GTG GGC CAG 
SEQ ID No. 25: GTG GAT CTC GGC CTC C 
SEQ ID No. 26: AGG CCG GCG TGG GGA AG 
SEQ ID No. 27: CTT GGC GAT CTG GCA GTA G 
SEQ ID No. 28: GCG GCC ACG ACC GTG AC 
SEQ ID No. 29: GGC AGC TTG GGT CTC TGG 
SEQ ID No. 30: CTG TAC GTC GGT GAC CCC 
SEQ ID No. 31: TCA GTG GAT CTC GGC CTC 
SEQ ID No. 32: AGG GGA CGC AGC GAA ACC 
SEQ ID No. 33: CCA TCA GCT CCA GGC TCT C 
SEQ ID No. 34: CCA GGA CAG GCG CAG ATG 
SEQ ID No. 35: GAT GAG GTG GCT GGC TGG A 
SEQ ID No. 36: TGG TCA GGT TCT GCA GGT G 
SEQ ID No. 37: CAC CTA CTC CAG GGA TGC 
SEQ ID No. 38: AGG AAA ATA GAA GCG TCA GTC 
SEQ ID No. 39: CAG GCC CAC TTG CCT GCC 
and SEQ ID No. 40: CTG TCC CCA AGC TGA TGA G 



17. Use of a sequence according to any one of 
Claims 6 to 8, for the manufacture of oligonucleotide 
primers for sequencing reactions or specific 
amplification reactions according to the PCR technique 
or any variant of the latter. 

18. Nucleotide primer pair, characterized in that 
it comprises the primers selected from the group 
consisting of the following sequences: 

a) sense primer: GCG AGC TGC CCT CGG AG (SEQ ID No. 20) 
antisense primer: GGT TCT GCA GGT GAC TCA G (SEQ ID No. 21) 

b) sense primer: GCC ATG CCT GTC TAC AAG (SEQ ID No. 22) 
antisense primer: ACC AGC TGG TTG ACG GAG (SEQ ID No. 23) 

c) sense primer: GTC AAC CAG CTG GTG GGC CAG (SEQ ID No. 24) 
antisense primer: GTG GAT CTC GGC CTC C (SEQ ID No. 25) 

d) sense primer: AGG CCG GCG TGG GGA AG (SEQ ID No. 26) 
antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27) 

e) sense primer: GCG GCC ACG ACC GTG AC (SEQ ID No. 28) 
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29) 

f) sense primer: CTG TAC GTC GGT GAC CCC (SEQ ID No. 30) 
antisense primer: TCA GTG GAT CTC GGC CTC (SEQ ID No. 31) 

g) sense primer: AGG GGA CGC AGC GAA ACC (SEQ ID No. 32) 
antisense primer: GGC AGC TTG GGT CTC TGG (SEQ ID No. 29) 

h) sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T) 
antisense primer: CCA TCA GCT CCA GGC TCT C (SEQ ID No. 33) 

i) sense primer: CCCCCCCCCCCCCCN (where N equals G, A or T) 
antisense primer: CCA GGA CAG GCG CAG ATG (SEQ ID No. 34) 

j) sense primer: CCCCCCCCCCCCCCCN (where N equals G, A or T) 
antisense primer: CTT GGC GAT CTG GCA GTA G (SEQ ID No. 27) 

k) sense primer: CAC CTA CTC CAG GGA TGC (SEQ ID No. 37) 
antisense primer: AGG AAA ATA GAA GCG TCA GTC (SEQ ID No. 38) 

and l) sense primer: CAG GCC CAC TTG CCT GCC (SEQ ID No. 39) 
antisense primer: CTG TCC CCA AGC TGA TGA G (SEQ ID No. 40) 

19. Use of a sequence according to any one of 
Claims 6 to 8, which is usable in gene therapy. 

20. Use of a sequence according to any one of 
Claims 6 to 8, for the production of diagnostic 
nucleotide probes or primers, or of antisense sequences 
which are usable in gene therapy. 

21. Use of nucleotide primers according to any one 
of Claims 6 to 8, for sequencing. 

22. Use of a probe or primer according to any one 
of Claims 13 to 16, as an in vitro diagnostic tool for 
the detection, by hybridization experiments, of nucleic 
acid sequences coding for a polypeptide according to 
any one of Claims 1 to 4, in biological samples, or for 
the demonstration of aberrant syntheses or of genetic 
abnormalities. 

23. Method of in vitro diagnosis for the detection 
of aberrant syntheses or of genetic abnormalities in 
the nucleic acid sequences coding for a polypeptide 
according to any one of Claims 1 to 4, characterized in 
that it comprises: 

- the bringing of a nucleotide probe according to 
any one of Claims 13 to 16 into contact with a 
biological sample under conditions permitting 
the formation of a hybridization complex 
between the said probe and the abovementioned 
nucleotide sequence, where appropriate after a 
prior step of amplification of the 
abovementioned nucleotide sequence; 
- the detection of the hybridization complex 
possibly formed; 
- where appropriate, the sequencing of the 
nucleotide sequence forming the hybridization 
complex with the probe of the invention. 

24. Use of a nucleic acid sequence according to any 
one of Claims 6 to 8, for the production of a 
recombinant polypeptide according to any one of Claims 
1 to 5. 

25. Method of production of a recombinant SR-p70 
protein, characterized in that transfected cells 
according to Claim 10 or 11 are cultured under 
conditions permitting the expression of a recombinant 
polypeptide of sequence SEQ ID No. 2, SEQ ID No. 4, SEQ 
ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 13, 
SEQ ID No. 15, SEQ ID No. 17 or SEQ ID No. 19 or any 



biologically active fragment or derivative, and in that 
the said recombinant polypeptide is recovered. 

26. Mono- or polyclonal antibodies or their 
fragments, chimeric antibodies or immunoconjugates, 
characterized in that they are capable of specifically 
recognizing a polypeptide according to any one of 
Claims 1 to 4. 

27. Use of the antibodies according to the 
preceding claim, for the purification or detection of a 
polypeptide according to any one of Claims 1 to 4 in a 
biological sample. 

28. Method of in vitro diagnosis of pathologies 
correlated with an expression or an abnormal 
accumulation of SR-p70 proteins, in particular the 
phenomena of carcinogenesis, from a biological sample, 
characterized in that at least one antibody according 
to Claim 25 is brought into contact with the said 
biological sample under conditions permitting the 
possible formation of specific immunological complexes 
between an SR-p70 protein and the said antibody or 
antibodies, and in that the specific immunological 
complexes possibly formed are detected. 

29. Kit for the in vitro diagnosis of an expression 
or an abnormal accumulation of SR-p70 proteins in a
biological sample and/or for measuring the level of 
expression of these proteins in the said sample, 
comprising: 

- at least one antibody according to Claim 25,
optionally bound to a support,
- means of visualization of the formation of
specific antigen-antibody complexes between an
SR-p70 protein and the said antibody, and/or 
means of quantification of these complexes. 

30. Method for the early diagnosis of tumour 
formation, characterized in that autoantibodies 
directed against an SR-p70 protein are demonstrated in 
a serum sample drawn from an individual, according to 
the steps that consist in bringing a serum sample drawn 
from an individual into contact with a polypeptide of 
the invention, optionally bound to a support, under
conditions permitting the formation of specific
immunological complexes between the said polypeptide 
and the autoantibodies possibly present in the serum 
sample, and in that the specific immunological 
complexes possibly formed are detected. 

31. Method of determination of an allelic 
variability, a mutation, a deletion, an insertion, a 
loss of heterozygosity or a genetic abnormality of the 
SR-p70 gene, characterized in that it utilizes at least 
one nucleotide sequence according to any one of Claims 
6 to 8. 

32. Method of determination of an allelic 
variability of the SR-p70 gene at position -30 and -20 
relative to the initiation ATG of exon 2 which may be 
involved in pathologies, and characterized in that it 
comprises at least: 

- a step during which exon 2 of the SR-p70 gene
carrying the target sequence is amplified by
PCR using a pair of oligonucleotide primers
according to any one of Claims 6 to 8;

- a step during which the amplified products are 
treated with a restriction enzyme whose 
cleavage site corresponds to the allele sought; 

- a step during which at least one of the 
products of the enzyme reaction is detected or 
assayed. 
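
By way of illustration only, the three steps above (amplification, restriction digestion, detection of the products) can be sketched as a fragment-length calculation on an already amplified sequence. The amplicon sequences below are invented, and the recognition site GGATCC with a cut after its first base is used purely as a familiar example. The application does not name the enzyme here, so both are assumptions.

```python
# Illustrative sketch only: the amplicons, the recognition site and the cut
# offset are assumptions chosen for the example, not values from the claims.

def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return the fragment lengths obtained by cutting `amplicon` at every
    occurrence of `site`, `cut_offset` bases into the site."""
    amplicon = amplicon.upper()
    site = site.upper()
    fragments = []
    start = 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


def carries_site(amplicon: str, site: str, cut_offset: int) -> bool:
    """True when digestion yields more than one fragment, that is, when the
    allele whose sequence creates the cleavage site is present."""
    return len(digest(amplicon, site, cut_offset)) > 1


if __name__ == "__main__":
    allele_with_site = "AAAAGGATCCTTTT"     # hypothetical amplicon
    allele_without_site = "AAAAGGATTCTTTT"  # single-base variant, site lost
    print(digest(allele_with_site, "GGATCC", 1))
    print(digest(allele_without_site, "GGATCC", 1))
```

In practice the fragment lengths would be read from a gel or another sizing method; the sketch only shows why the presence or absence of the site distinguishes the two alleles.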

33. Pharmaceutical composition comprising as active 
principle a polypeptide according to any one of Claims 
1 to 4. 

34. Pharmaceutical composition according to the 
preceding claim, characterized in that it comprises a 
polypeptide according to Claim 2. 

35. Pharmaceutical composition containing an
inhibitor or an activator of SR-p70 activity. 

36. Pharmaceutical composition containing a 
polypeptide derived from a polypeptide according to any 
one of Claims 1 to 5, characterized in that it is an 
inhibitor or an activator of SR-p70. 
1 TGCCTCCCCGCCCGCGCACCCGCCCCGAGGCCTGTGCTCCTGCGAAGGGG 50

II Nil 

1 GGGGCTCCGGGG 12 

51 ACGCAGCGAAGCCGGGGCCCGCGCCAGGCCGGCCGGGACGGACGCCGATG 100 

li I I UNI II I I ! I Ml 

13 ACACTTGGCGTCCGGGCTGGAAGCGTGCTTTCCAAGACGGTGACACGCTT 62 
101 CCCGGAGCTGCGACGGCTGCAGAGCGAGCTGCCCTCGGAGGCCGGTGTGA ISO 

III III I II I III MM III I I 

63 CCCTGAGGATTGGCAGCCAGACTGCTTACGGGTCAC . . . TGC C ATGGAGG 109 
151 GGAAGATGGCCCAGTCCACCACCACCTCCCCCGATGGGGGCACCACGTTT 200 

I I II I III Ml till I || ||[ 

110 AGCCGCAGTCAGATCCCAGCATCGAGCCCCCTCTGAGTCAGGAAACATTT 159 
2 01 GAGCACCTCTGGAGCTCTCTGGAACCAGACAGCACCTACTTCGACCTTCC 250 

INI MM II Mill i II Mill 

160 TCAGACCTATGGAAACTACTTCCTGAAAACAAC . GTTCTGTCCCCCTTGC 208 
2 51 CCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGTGGCACGGATTCCAGCA 300 

I Mill I I! Ill II 1 1 I II I I 

2 09 CGTCCCAAGCGGTGGATGATTTGATGCTCTCTCCGGATGATCTTGCACAA 258 
301 TGGACGTCTTCCACCTAGAGGGCATGACCACATCTGTCATGGCCCAGTTC 3 50 

, Ml I I II Mill II 

259 TGG TTAACTGAAGACCCAGGTC 280 

351 AATTTGCTGAGCAGCACCATGGACCAGATGAGCAGCCGCGCTGCCTCGGC 400 

I M I I Mill ill || || | 

281 CAGATGAAGCTC CCAGAATGTCAGAGGCTGCTCCCCACA 319 

401 CAGCCCGTACACCCCGGAGCACGCCGCCAGCGTGCCCACCCATTCACCCT 450 

MM i 1 1 M I I II Ml lit I III 

3 20 TGGCCCCCACACCAGCAGCTCCTACACCGGCGGCCCCTGCACCAGCCCC. 36 8 
451 ACGCACAGCCCAGCTCCACCTTCGACACCATGTCGCCCGCGCCTGTCATC 500 

Ml II I I I II 

369 CTCCTGGCCCCTGTCATCCTCTGTC 393 

501 CCCTCCAACACCGACTATCCCGGACCCCACCACTTCGAGGTCACTTTCCA 550 

I! Ill M Ml I III I II IN I MM 

394 CCTTC C C AGAAAAC CT AC C ACGGCAGCTACGGTTTC CGTCTGGGCTTCCT 443 
551 GCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCACTCTTGA 600 

II I M I M IMMMII II II IMMIillll III 

44 4 GCATTCTGGAACAGCCAAGTCTGTGACTTGCACGTACTCCCCTGACCTCA 493 
601 AG AAACTCTACTGC CAGATCGCCAAGACATGC CCCATCCAGATCAAGGTG 650 
494 IJUwaUvI-L.^tJ^ 543 
651 TCCGCCCCACCGCCCCCGGGCACCGCCATCCGGGCCATGCCTOTCTACAA 700 

II Mil II II MM I I (III IMMI I MINI! 

544 GATTCCACACCCCOXrCCGGCAGCOX^rCCGCGCCATGGCCATCTAC^ 593 
701 GAAGGCGGAGCACGTGACCGACATCGTGAAGCGCTGCCCCAACCACGAGC 750 

I II I Mill MM II llllll MINIUM Mil Mil 

594 GCAGTCACAGCACATGACTGAGGTCGTGAGGCGCTGCCCCCACCATGAGC 643 
751 TCGGGAGGGACTTCAACGAAGGACAGTCTGCCCCAGCCAGCCACCTCATC 800 

II III M II Mill f II II Ml 

644 GCTGCTCAGAC AGC GATGGA CTGGCCCCTCCTCAACATCTTATC 687 

801 CGTGTGGAAGGCAATAATCTCTCGCAGTATGTGGACGACCCTGTCACCGG 850 

I I I M SI M I Ml I I Mill III III III 

688 CGAGTGGAAGGAAATTTCCGTGTGGAGTATTCGGATGACAGAAACACTTT 737 
851 CAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACAGAAT 900 

I f I II II MMIIMIIimiil I! Mil li -I II I 

738 TCGACATAGTGTGGTGGTGCCCTATGACXCGCCTGAGGTTGGCTCTGACT 7 87 
901 TCACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGC 950

I i I 1 1 1 1 1 1 1 liliill MIIINIIN! til ilil III 

78 8 GTACCACCATCCACTACAACTACATGTGTAAC^GTTCCTGC^TGGGCGGC 837 
951 ATGAACCG ACGGCCCATCCTCATCATCATCACCCTGGAGACGCGGG ATGG 1000 

MINIM MIIMIMMI NIMH MM !l| 

3 38 ATGAACCGGAGGCCCATCCTCACAATTATCACACTGGAAGACTCCAGTGG 3 87 
1001 GCAGGTGCTGGGCCGCCGGTCCTTCGAGGGCCGCATCTGCGCCTGTCCTG 1050 

I I Mill M Hi III! il I II MMIIIMI 

888 TAATCTACTGGGACGGAACAGCTTTGAGGTGCGAGTTTGTGCCTGTCCTG 937 
10 51 GCCGCGACCGAAAAGCCGATGAGGACCACTACCGGGAGCAGCAG<KXrrTG 1100 

i mm: mi ii ii i i ii j 

938 GGAGAGACCGGCGCACAGAGGAAGAGAATTTCC G 971 



1101 AATGAGAGCTCCGCCAAGAACGGGGCTGCCAGCAAGCGCGCCTTCAAGCA 1150 
INN III I I MINI I I || | 

97 2 CAAGAAAGGGGAGCCTTGCCACGAGCTGCCCCCTGGGAGCACTAAGCGAG 1021 

1151 GAGTCCCCCTGCCGTCCCCGCCCTGGGCCC . GGGTGTGAAGAAGCGGCGG 1199 

i i 1 1 1 ii in mi i nun i i i 

102 2 CACTGCCCAACAACACCAGCTCCTCTCCCCAGCCAAAGAAGAAACCACTG 1071 
120 0 CACGGAGACGAGGACACGTACTACCTGCAGGTGCGAGGCCGCGAGAACTT 12 49 

I MM I Ml III Ml I II II II Ml IN 

107 2 GATGGAGAATATTTCAC C CTTCAGATCCGCGGGCGTGAGCGCTT 1115 

12 50 CGAGATCCTGATGAAGCTGAAGGAGAGCCTGGAGCTGATGGAGTTGGTGC 1299 

MUM I IIMMI 111 I II 1 1 II I IN 

1116 CGAGATGTTCCGAGAGCTGAATGAGGCCTTGGAACTCAAGGA 1157 

1300 CGCAGCCGCTGGTAGACTCCTATCGGCAGCAGCAGCAGCTCCTACAGAGG 13 49 

ii mil ii i mm i inn ii iii 

115 8 TGCCCAGGCTGGGAAAGAGCCAGCGG . . GGAGCAGGGCTCACTCGAGCCA 1205 

13 50 CCGAGTCACCTACAGCCCCCATCCTACGGGCCGGTCCTCTCGCCCATGAA 1399 

II MM Mllll III 

1206 CCTGAAGTCCAAGAAG<XXXIAATCTACCTCCCGCCATAAAAAATTCATGT 1255 

1400 CAAGGTGCACGGGGGCGTGAACAAGCTGCCCTCCGTCAACCAGCTGGTGG 144 9 

M MM II MM II MM] I I 

1256 TCAAGACAGAGGGGCCTGACTCAGACTGACATTC TCAGCTTCTTG 1300 

1450 GCCAGCCTCCCCCGCACAGCTCGGCAGCTACACCC^^ 1499 

I III II I III! MM II I Ml 

1301 TTCCCCCACTGAGCCTCCCACCCCCATCT . CTCCCIXXCCTGCCATTTTG 1349 
1500 GGCTCTGGGATGCTCAACAACCACGGCCACGCAGTGCCAGCCAACAGCGA 1549 

I MINI I III I II III I II III I 

1350 AGTTCTGGGTCITTAAACC C TT G CTT G CAATAGGTGTGTGTCAGAAGCAA 1399 
1550 GATGACCAGCAGCCACGGCACCCAGTCCATGGTCTCGGGGTCCCACTGCA 1599 
1400 A 1400 



FIG.1 cont. 
1 MAQSTTTSPDGGTTFEHLWSSLEPDSTYFDLPQSSRGNNEVVGGTDSSMD 50

..|:::..:|. :| :.|: || . .:|| : . .. .:( 

1 MEEPgSDPSIEPPLS. . . . QETFSDLWKLLPENNVLSPLPSQAVD 41 

51 VFHLEGMTTSVMAQFNIX S STMDQMSSRAAS AS PYTPEHAASVPTHS P YA 100 

= I ••■ = 11 = ••• h !•• .|..||.-|. -I : 

42 DLML . . . SPDDLAQWLTEDPGPDEAPRMSEAAPHMAPTPAAPTPA. APAP 87 

ioi qpsstfd™spapvipsntdypgphhfevtfqqsstaksatwtyspllkk 150 

-II--: •- :||.--|.|.. I :.| = I H M I - i III! 1-1 

8 8 APSWPL SSSVPSQKTYHGSYGFRLGFLHSGTAKSVTCTYSPDLNK 13 2 

151 LYCQIAKTCPIQIKVSAPPPPGTAIRAMPVYXKAEHVTDIVKRCPNHELG 200 

: :|hllllhh.|...|!l|. :H|::||..:|:|::|:|||:|| 
133 MFCQLAXTCPVQLWVDSTPPPGSRVRAMAIYKQSQHMTEWRRCPHHE . . 180 

201 RDFNEGQSAPASHLIRVEGNNLSQYVDDPVTGRQSVWPYEPPQVGTEFT 250 

I !h llllllll =| !!• I I = I i I I I I I I I = I I • : | 

181 RC S D S DGLA P PQ HL I R VEGNLRVEY SDDRNTFRH SVWP YEP PEVGS DCT 230 

251 TILYNFMCNSSCVGG>INRRPILIIITLETRDG<3VLGRRSFEGRICACPGR 300 

I I lhlMllhlMIMIIi-IIMI---!-:||l-l!l-|:|||||| 
231 TIHYNYMCNSSCMGGhlNRKPILTIITLEDSSGNLLGRNSFEVRVCACPGR 280 

301 DRKADEDHYREC^AI^SSAKNGAASKRAFXQSPPAVPALGPGVKKRRHG 3 50 

||:-:h::|... • •^•llh ]• -|::. 

281 DRRTEEENFRXKG . . EPCHELPPGSTKRALPNNTSSSPQ PKKXPL 323 

3 51 DEOTYYLQVRGRENFE ILMKLKESLELHELVPQPLVDSYRQQQQLLiQRPS 400 

I:: : I I = I I I I • I I : : - • I - I ■ I I I • : :•• ••: !•= !-••• 
324 DGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPAGSRAHSSHLKSKX 37 3 

401 HLQPPSYGPVLSPMNKVHGGVNKLPSVNQLVGQPPPHSSAATPNLGPVGS 450 

374 GQSTSRHXXFMFXTEGPDSD 393 



FIG. 2 
1 TGCCTCCCCGCCCGCGCACCCGCCCCGAGGCCTGTGCTCCTGCGAAGGGG 50

!i!Ill!!IIIM!lll!l!illlll!!!t!lll!lll!l!!!l!ljlli 

1 TGCCTCCCCGCCCGCGCACCCGCCCCGAGGCCTGTGCTCCTGCGAAGGGG 50 
51 ACGCAGCGAAGCCGGGGCCCGCGCCAGGCCGGCCGGGACGGACGCCGATG 100 

M!Mllil!l!l!ljMM!!!IIMlM!Ml!!ll!lllll!NI!l 

51 ACGCAGCGAAGCCGGGGCCCGCGCCAGGCCGGCCGGGACGGACGCCGATG 100 

101 CCCGGAGCTGCGACGGCTGCAGAGCGAGCTGCCCTCGGAGGCCGGTGTGA 150 

I i If I I I I I I I I I t II I I I I ! I I [ ] M I I I I I I I I II I I I I I I ! I I I I ! I 
101 CCCGGAGCTGCGACGGCTGCAGAGCGAGCTGCCCTCGGAGGCCGGTGTGA 150 

1 5 1 GGAAGATGGCCCAGTCCACCACCACCTCCCCCGATGGGGGCACCACGTTT 2 00 

MI!IM!!lilMMM!!IMIIiimiM!imil!IIMIMM 

151 GGAAGATGGCCCAGTCCACCACCACCTCCCCCGATGGGGGCACCACGTTT 2 00 
2 01 GAGCACCTCTGGAGCTCTCTGGAACCAGACAGCACCTACTTCGACCTTCC 2 50 

IIMIIMIIlMMilllllMIIIIIMIMIIINIIIMMIIM! 

2 01 GAGCACCTCTGGAGCTCTCTGGAACCAGACAGCACCTACTTCGACCTTCC 2 50 
2 51 CCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGTGGCACGGATTCCAGCA 3 0C 

'HMMMMlMIMHMimilMMimilimmiimi 

2 51 CCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGTGGCACGGATTCCAGCA 3 00 

3 01 TGGACGTCTTCCACCTAGAGGGCATGACCACATCTGTCATGGCCCAGTTC 3 5 0 

i ! ! I I I I I I I M I 1 I I II 1 I I I i I I ! II ! II I I I I I I II I I I I 1 1 I II I I 

301 TGGACGTCTTCCACCTAGAGGGCATGACCACATCTGTCATGGCCCAGTTC 3 50 
351 AATTTGCTGAGCAGCACCATGGACCAGATGAGCAGCCGCGCTGCCTCGGC 400 

iillMIIMIIIIIIlllMIIMIIMMIIIIinillllllIMM 

3 51 AATTTGCTGAGCAGCACCATGGACCAGATGAGCAGCCGCGCTGCCTCGGC 400 

4 01 CAGCCCGTACACCCCGGAGCACGCCGCCAGCGTGCCCACCCATTCACCCT 4 50 

illMMIIIMMIIilMIIIMMIMIIIIIIIMIIMMIMII 

4 01 CAGCCCGTACACCCCGGAGCACGCCGCCAGCGTGCCCACCCATTCACCCT 4 5 0 



501 CCCTCCAACACCGACTATCCCGGACCCCACCACTTCGAGGTCACTTTCCA 550 

iniMIIIIMMMIIMIIMIMMINIIIMIMIIIIIMMI 

501 CCCTCCAACACCGACTATCCCGGACCCCACCACTTCGAGGTCACTTTCCA 550 
551 GCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCACTCTTGA 600 

: l I N I I I I II 1 I I I II I i I I I I I I I I I I I I I I I II I I I I I M I M I II 

551 GCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCACTCTTGA 600 
601 AGAAACTCTACTGCCAGATCGCCAAGACATGCCCCATCCAGATCAAGGTG 650 

MIMMIIIMMMIMMinillMllllllllllillllllllll 

601 AGAAACTCTACTGCCAGATCGCCAAGACATGCCCCATCCAGATCAAGGTG 650 

6 51 TCCGCCCCACCGCCCCCGGGCACCGCCATCCGGGCCATGCCTGTCTACAA 700 

lllliMlllllNIIMllMIMIIIMIilllMMIIIMMIIII 

651 TCCGCCCCACCGCCCCCGGGCACCGCCATCCGGGCCATGCCTGTCTACAA 700 

7 01 GAAGGCGGAGCACGTGACCGACATCGTGAAGCGCTGCCCCAACCACGAGC 750 

mmmMMMiimimmmmiimiMminiM 

7 01 GAAGGCGGAGC ACGTGAC CGACATCGTGAAGCGCTGCCCCAACCACGAGC 750 
7 51 TCGGGAGGGACTTCAACGAAGGACAGTCTGCCCCAGCCAGCCACCTCATC 800 

MIMIMIIIIIIMHIMMMMIIIMIIIIIIINIIIMIIII 

751 TCGGGAGGGACTTCAACGAAGGACAGTCTGCCCCAGCCAGCCACCTCATC 800 
801 CGTGTGGAAGGCAATAATCTCTCGCAGTATGTGGACGACCCTGTCACCGG 850 

! M i J f i I! 1 1 1 i'l i 1 i 1 1 1 1 M 1 1 M I ! 1 1 ! 1 1 ! I ( 1 1 1 1 ! M 1 1 1 M 1 

801 CGTGTGGAAGGCAATAATCTCTCGCAGTATGTGGACGACCCTGTCACCGG 850 



FIG. 3 



8 51 CAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACAGAAT 900 

iiiiiiiiiiuiiiiiiiiiMMiiMiiiiiiiiiiiiiiiiiim rnnt 

851 CAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACAGAAT 90& V^WI I l . 
901 TCACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGC 950

Bn INillMllllimillMllllMIMIIIIIIIMMIIIINIIi 

901 TCACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGC 95 0 
951 ATGAACCGACGGCCCATCCTCATCATCATCACCCTGGAGACGCGGGATGG 1000 

oe , lilllHIIMiMIIIMillllijnillllllllilillliliiM! 

951 ATGAACCGACGGCCCATCCTCATCATCATCACCCTGGAGACGCGGGATGG 1000 
1001 GC AGGTGCTGGGCCGC CGGTCCTTCGAGGGCCGCATCTGCGCCTGTCCTG 1050 

,„„ MlllimiiMIIIIIIIINMIIIMIIIIIIIillMIIIIMI! 

1001 GCAGGTGCTGGGCCGCCGGTCCTTCGAGGGCCGCATCTGCGCCTGTCCTG 1050 
1051 GCCGCGACCGAAAAGCCGATGAGGACCACTACCGGGAGCAGCAGGCCTTG 1100 

, Ae Mlli!N!!IIM!!lllllllll[IMIIII!!]||j||illl!ljM 

1051 GCCGCGACCGAAAAGCCGATCAGCACCACTACCCGGAGCAGCAGGCCTTG 1100 
1101 AATGAGAGCTCCGCCAAGAACGGGGCTGCCAGCAAGCGCGCCTTCAAGCA 1150 

lini MMIMimillllllilllllllllllllHIMIIlllliiiijii 

1101 AATGAGAGCTCCGCCAAGAACGGGGCTGCCAGCAAGCGCGCCTTCAAGCA 1150 

1151 gagtccccctgccgtccccgccctgok:ccgggtgtgaagaagcggcggc 1200 

1151 GAGTCCCCCTrcCGTCCCCcUcC^^ 120 0 
12 01 ACGGAGACGAGGACACGTACTACCTGCAGGTGCGAGGCCGCGAGAACTTC 1250 

iMII[l!IIIIIIMIIM!!M!!!lillli!!liliniil|M||i 

1201 ACGGAGACGAGGACACGTACTACCTGCAGGTGCGAGGCCGCGAGAACTTC 12 50 
12 51 GAGATCCTGATGAAGCTGAAGGAGAGCCTGGAGCTGATGGAGTTGGTGCC 1300 

,, c llllimiliimi!MlM[lll!iMlllilli||INIMj|!!| 

12 51 GAGATCCTGATGAAGCTGAAGGAGAGCCTGGAGCTGATGGAGTTGGTGCC 1300 

13 01 GCAGC CGCTGGT AGACTCCTATCGGCAGC AGCAGCAGCTCCTACAGAGGC 13 50 

IMM!lllinni!MI!!I!!MIII!llllllfli!l!|INilN 

1301 GCAGCCGCTGGTAGACTCCTATCGGCAGC AGCAGCAGCTCCTACAGAGGC 13 50 



MMIIillllllllllllllllllillllllllllMllllllTinT, 

13 51 CGAGTCACCTACAGCCCCCATCCTACGGGCCGGTCCTCTCGCCCATGAAC 
1401 AAGGTGCACGGGGGCGTGAACAAGCTGCCCTCCGTCAACCAGCTGGTGGG 

!l!li!IM!MIII!ll!!;iMIII|jl!lllll!|l!ll!l!!f||' 

1401 AAGGTOCACGGGGGCCTGAACAAGCTGCCCTCCGTC A ACCAGCTGGT GG G 

14 51 CCAC^CTCCCCCGCACAGCTCGC^AGCTACACCCAACCTGGGACCTGTGG 

IlillllMMNIIIirMIIlllMIMIIINIllliMllini!! 
1451 ccagcctcccccgcacagctcg<x:agctacacccaacctgggacctgtgg 

1501 GCTCTGGGATGCTCAACAACCACG<KCACGCAGTGCCAGCCAACAGCGAG 

illllilllllltllllllllilllllMiniiljlllllMIIMlll 

1501 GCTCT<XWATGCTCAAC^ACCAC(XXXACOC*GTG^ 

1551 ATGACCAGCAGCCACGGCACCCAGTCCATGGTCTCGGGCTCC^ACTGCAC 

MIMMMMII MIMIII MMMIMMIIMMHMIMIMM 

1551 ATGACCAGCAGCCACCGCACCCAGTCCATXXSTCTCGGGGTCCCACTGCAC 
1601 TCCGCCACCCCCCTACCACGCa^CCaU3CCTCGTCAGTTTTTTAAa« 

Mill l!M I Mill IMMIMMMIIIMIMI 
1601 tccgccacccccctaccacgccgaccccagcctcgtc 



17 01 agcatttaccacctgcagaacctgaccatcgaggacctgggggccctgaa 

IIMMIM II I lit ! I L 

1638 AOGACCTGGGGGCCCTGAA 



1750 
1656 
1800 
1657 GATCCCCGAGCAGTATCGCATGACCATCTGGCGGGGCCTGCAGGACCTGA 1706
1801 AGCAGGGCCACGACTACGGCGCCGCCGCGCAGCAGCTGCTCCGCTCCAGC 1850 

iimmiimmMiiiiimmnimmmmmm 

17 07 AGCAGGGCCACGACTACGGCGCCGCCGCGCAGCAGCTGCTCCGCTCCAGC 17 56 
1851 AACGCGGCCGCCATTTCCATCGGCGGCTCCXGGGAGCTGCAGCGCCAGCG 1900 

iminiiMfiiiMMiiiMiiMiiiiiiiiiiiiiiiiMiin 

1757 AACGCGGC CGCCATTTCCATCGGC GGCTCCGGGGAGCTGCAGCGCCAGCG 18 06 
1901 GGTCATGGAGGCCGTGCACTTCCGCGTGCGCCACACCATCACCATCCCCA 1950 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 i 1 1 1 ! || 1 1 1 1 1 i 1 1 1 1 1 1 [ 1 1 1 

1807 GGTCATGGAGGCCGTGCACTTCCGCGTGCGCCACACCATCACCATCCCCA 1856 
1951 ACCe^GGCGrcCCCGTC^ 200 o 
1857 ACCGCGGCGGCCCCGGCGCCGGCCCCGACGAGTGGGCGGACTTCGGCTTC 1906 
2 001 GACCTGCCCGACTGCAAGGCCCGCAAGCAGCCCATCAAGGAGGAGTTCAC 2 050 

, nillMIMIIMIIIIimilllillllllMIMilllMMliM 

1907 GACCTGCCCGACTGCAAGGCCCGCAAGCAGCCCATCAAGGAGGAGTTCAC 1956 
2 051 GGAGGCCGAGATCCACTGAGGGGCCGGGCCCAGCCAGAGCCTGTGCCACC 2100 

MimmmimMiiimmiimiimimiimM!! 

1957 GGAGGCCGAGATCCACTGAGGGGCCGGGCCCAGCCAGAGCCTGTGCCACC 2006 
2101 GCCCAGAC^CCC^ 2128 
2007 GCCCAGAGACCCAGGCCGCCTCGCTCTC 2034 
1 TGCCTCCCCGCCCGCGCACCCGCCCCGAGGCCTGTGCTCCTGCGAAGGGGACGCAGCGAA 60

61 GCCGGGGCCCGCGCCAGGCCGGCCGGGACGGACGCCGATGCCCGGAGCTGCGACGGCTGC 120 

121 AGAGOGAGCTGCCCTCGGAGGCCGGTGTGAGGAAGATCGCCCAGTCCACCACCACCTCCC 180 

-10 MAQSTTTSP9 

181 CCGATGGGGGCACCACGTTTGAGCACCTCTGGAGCTCTCTt3GAACCAGACAGCACCTACT 240 

10 DGGTTFEHLWSSLEPDSTYF 29 

2 41 TCGACCTTCCCCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGTGGCACGGATTCCAGCA 3 00 
30 DLPQSSRGNNEVVGGTDSSM 49 

3 01 TGGACGTCTTCCACCTAGAGGGCATGACCACATCTGTCATGGCCCAGTTCAATTTGCTGA 3 60 
50 DVFHLEGMTTSVMAQFNLLS 69 

3 61 GCAGCACCATGGACCAGATGAGCAGCCGCGCTGCCTCGGCCAGCCCGTACACCCCGGAGC 420 
70 STMDQMSSRAASASPYTPEH 89 

421 ACGCCGCCAGCGTGCCCACCCATTCACCCTACGCACAGCCCAGCTCCACCTTCGACACCA 480 

90 AASVPTHSPYAQPSSTFDTM 109 

4 81 TGTCGCCCGCGCCTGTCATCCCCTCCAACACCGACTATCCCGGACCCCACCACTTCGAGG 540 
110 SPAPVIPSNTDYPGPHHFEV 129 
541 TCACTTTCCAGCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCACTCTTGA 600 
130 TFQQSSTAKSATWTYSPL.LK 149 

6 01 AGAAACTCTACTGCCAGATCGCCAAGACATGCCCCATCCAGATCAAGGTGTCCGCCCCAC 660 
150 KLYCQIAKTCPIQIKVSAPP 169 
661 CGCCCCCGGGCACCGCCATCCGGGCCATGCCTGTCTACAAGAAGGCGGAGCACGTGACCG 720 
170 PPGTAI RAMPVYKKAEHVTD 189 

7 21 ACATCGTGAAGCGCTGCCCCAACCACGAGCTCGGGAGGGACTTCAACGAAGGACAGTCTG 7 80 
190 IVKRCPNHELGRDFNEGQ.SA 209 
7 81 CCCCAGCCAGCCACCTCATCCGTGTGGAAGGCAATAATCTCTCGCAGTATGTGGACGACC 840 
210 PASHL IRVEGNNLSQYVDDP 229 
841 CTGTCACCGGCAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACAGAAT 900 
230 VTGRQSVVVPYEPPQVGTEF 249 
901 TCACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGCATGAACCGAC 960 

250 ttilynfmcnsscvggmnrr 269 

961 g<kxcatcctcatcatcatc*ccctggacacgcgggatggg^ 1020 

270 pilii itletrdgqvlgrrs 289 

1021 ccttcgagggccgcatctgcgcctgtcctggccgcgaccgaaaagccgatgaggaccact 1080 

290 fegr icacpgrdrkadedhy 309 

10 81 accgggagcagcaggccttg aatgagagctccgccaagaacggggctgccagcaagcgcg 1140 

310 reqqalnessakngaaskra 329 

1141 ccttcaagcagagtccccctgccgtccccgccctgggcccgggtgtgaagaagcggcggc 1200 

330 fkqsppavpalgpgvkxrrh 349 

1201 acggagacgaggacacgtactacctgcacgtgcga<xx:cgcgagaactn^agatcctga 1260 

350 gdedtyylqvrgremfeii.m 369 

1261 TGAAGCTGAAGGAGAGCCTGGAGCTGATGGAGTTGGTGCCGCAGCCGCTGCTAGACTCCT 1320 

370 KLKESLELKELVPQPLVDSY 389 

1321 ATCGGCAGCAGCAGCAGCTCCTACAGAGGCCGAGTCACCTACAGCCCCCATCCTACGGGC 1380 

390 RQQQQLLQRPSHLQPPSYGP 409 

410 VLSPMNXVHGGVNKLPSVNQ 429 

1441 AGCTGGTpGGCCAGCCTCCCCa^CAGCTCGGCAG^ 1500 

430 LVGQP P PHSSAATPNLGPVG 449 

1501 GCTCTGGGATGCTCAACAACCACGGCCACGCAGTGCCAGCCAACAGCGAGATGACCAGCA 1560 

450 SGMLNNHGHAVPANSEMTSS 469 

1561 GCCACGGCACCCAGTCCATGGTCTCGGGGTCCCACTGCACTCCGCCACCCCCCTACCACG 1620 

470 HGTQSMVSGSHCTPPPPYHA 489 

1621 CCGACCCCAGCCTCGTCAGTTTTTTAACAGGATTGCGGTGTCCAAACTGCATCGAGTATT 1680 

490 DPSLVSFLTGLGCPNCIEYF 509
1681 TCACGTCCCAGGGGTTACAGAGCATTTACCACCTGCAGAACCTGACCATCGAGGACCTGG 1740
510 TSQGLQSIYHLQNLTIEDLG 529

1741 GGGCCCTGAAGATCCCCGAGCAGTATCGCATGACCATCTGGCGGGGCCTGCAGGACCTGA 1800
530 ALKIPEQYRMTIWRGLQDLK 549

1801 AGCAGGGCCACGACTACGGCGCCGCCGCGCAGCAGCTGCTCCGCTCCAGCAACGCGGCCG 1860
550 QGHDYGAAAQQLLRSSNAAA 569

1861 CCATTTCCATCGGCGGCTCCGGGGAGCTGCAGCGCCAGCGGGTCATGGAGGCCGTGCACT 1920
570 ISIGGSGELQRQRVMEAVHF 589

1921 TCCGCGTGCGCCACACCATCACCATCCCCAACCGCGGCGGCCCCGGCGCCGGCCCCGACG 1980
590 RVRHTITIPNRGGPGAGPDE 609

1981 AGTGGGCGGACTTCGGCTTCGACCTGCCCGACTGCAAGGCCCGCAAGCAGCCCATCAAGG 2040
610 WADFGFDLPDCKARKQPIKE 629

2041 AGGAGTTCACGGAGGCCGAGATCCACTGAGGGGCCGGGCCCAGCCAGAGCCTGTGCCACC 2100
630 EFTEAEIH* 649

2101 GCCCAGAGACCCAGGCCGCCTCGCTCTCCTTCCTGTGTCCAAAACTGCCTCCGGAGGCAG 2160
2161 GGCCTCCAGGCTGTGCCCGGGGAAAGGCAAGGTCCGGCCCATGCCCCGGCACCTCACCGG 2220
2221 CCCCAGGAGAGGCCCAGCCACCAAAGCCGCCTGCGGACAGCCTGAGTCACCTGCAGAACC 2280
2281 [illegible] 2340
2341 CACTGCCGGGCGTGCTCCATGGCAGGCGTGGGTGGGGACCGCAGTGTCAGCTCCGACCTC 2400
2401 [illegible] 2460
2461 AATCCTCTTCGCTGGTGGACTGCCAAAAAGTATTT TG CGACAT 2520
2521 TGGTGAGCAGCCAAGCGACTGTGTCTGAAACACCGTGCATTTTCAGGGAATGTCCCTAAC 2580
2581 GGGCTGGGGACTCTCTCTGCTGGACTTGGGAGTGGCCTTTGCGCCCAGCACACTGTATTC 2640
2641 TGCGG<^CCGCCTCCTrcCTGCCCCTAACAACCACCAAAGTGTTGCTGAAATTGGAGAAA 2700
2701 ACTGGGGAAGGCGCAACCCCTCCCAGGTGCGGGAAGCATCTGGTACCGCCTCGGCCAGTG 2760
2761 CCCCTCAGCCTGGCCACAGTCACCTCTCCTTGGGGAACCCTGGGCAGAAAGGGACAGC^ 2820
2821 GTCCTTAGAGGACCGGAAATTGTCAATATTTGATAAAATCATACCCTTTTCTAC 2874
61 GCCGGGGCCCGCGCCAGGCCGGCCGGGACGGACGCCGATGCCCGGAGCTGCGACGGCTGC 120

121 AGAGCGAGCTGCCCTCGGAGGCCGGTGTGAGGAAGATGGCCCAGTCCACCACCACCTCCC 180 

-10 MAQSTTTSP 9 

181 CCGATGGGGGCACCACGTTTGAGCACCTCTGGAGCTCTCTGGAACCAGACAGCACCTACT 2 40 

10 DGGTTFEHLWSSLKPDSTYF 29 

2 41 TCGACCTTCCCCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGTGGCACGGATTCCAGCA 300 
30 DLPQSSRGNNEVVGGTDSSH 49 

3 01 TGGACGTCTTCCACCTAGAGGGCATGACCACATCTGTCATGGCCCAGTTCAATTTGCTGA 3 60 
50 DVFHLEGMTTSVMAQFNLLS 69 

361 GCAGCACCATGGACCAGATGAGCAGCCGCGCTGCCTCGGCCAGCCCGTACACCCCGGAGC 420 

70 STMDQMSSRAASASPYTPEH 89 

421 ACGCCGCCAGCGTGCCCACCCATTCACCCTACGCACAGCCCAGCTCCACCTTCGACACCA 480 

90 AASVPTHSPYAQPSSTFDTM 109 

481 TGTCGCCCGCGCCTGTCATCCCCTCCAACACCGACTATCCCGGACCCCACCACTTCGAGG 540 

110 SPAPVI PSNTDYPGPHHFEV 129 

541 TCACTTTCCAGCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCACTCTTGA 600 ' 

130 TFQQ SSTAKSATWTYSPLLK 149 

601 AGAAACTCTACTGCCAGATCGCCAAGACATGCCCCATCCAGATCAAGGTGTCCGCCCCAC 660 

150 KLYCQIAKTCPIQIKVSAPP 169 

661 CGCCCCCGGGCACCGCCATCCGGGCCATGCCTGTCTACAAGAAGGCGGAGCACGTGACCG 720 

170 PPGTAIRAMPVYKKAEHVTD 189 

721 ACATCGTGAAGCGCTGCCCCAACCACGAGCTCGGGAGGGACTTCAACGAAGGACAGTCTG 7 80 

190 IVKRCPNHELGRDFNEGQSA 209 

7 81 CCCCAGCCAGCCACCTCATCCGTGTGGAAGGCAATAATCTCTCGCAGTATGTGGACGACC 840 

210 PASHLIRVEGNNLSQYVDDP 229 

841 CTGTCACCGGCAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACAGAAT 900 

230 VTGRQSVVVPYEPPQVGTEF 249 

901 TCACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGCATGAACCGAC 960 

250 TTILYNFMCNSSCVGGMNRR 269 

961 GGCCCATCCTCATCATCATCACCCTGGAGACGCGGGATGGGCAGGTGCTGGGCCGCCGGT 1020 

270 PI LI I ITLETRDGQVLGRRS 289 

1021 CCTTCGAGGGCCGCATCTGCGCCTGTCCTGGCCGCGACCGAAAAGCCGATGAGGACCACT 1080 

290 FEGR ICACPGRDRKADEDHY 309 

1081 ACCGGGAGCAGCAGGCCTTGAATGAGAGCTCCGCCAAGAACGGGGCTGCCAGCAAGCGCG 1140 

310 REQQALNESSAKNGAASKRA 329 

1141 CCTTCAAGCAGAGTCCCCCTGCCGTCCCCGCCCTGGGCCCGGGTGTGAAGAAGCGGCGGC 1200 

330 FKQS PPAVPALGPGVKKRRH 349 

1201 ACGGAGACGAGGACACGTACTACCTGCAGGTGCGAGGCCGCGAGAACTTCGAGATCCTGA 1260 

350 GDEDTYYLQVRGRBNFEILM 369 

1261 TGAAGCTCAAGGAGAGCCTGGAGCTGATGGAGTT G GT GC CGCAGCCGCTGCTAGACTCCT 1320 

370 KL.KE St.EL.MELVPQPl.VDSY 389 

13 21 ATCGGCAGCAGCAGCAGCTCCTACAGAGGCCGAGTCACCTACAGCCCCCATCCTACGGGC 1380 

390 RQQQQLLQRPSHLQ.PPSYGP 409 

1381 CGGTCCTCTCGCCCATGAACAAGGTGCACGGGGGCGTGAACAAGCTGCCCTCCGTCAACC 14 40 

410 VL S PMNKVHGGVNKLPSVNQ 429 

1441 AGCTGGTtKXKCAGCCrCCCCCGCACAGCTCGGCAGCTACACCCAACCTGGGACCTGTGG 1500 

430 LVGQ PPPHSSAATPNLGPVG 449 

1501 GCTCTGGGATGCTCAACAACCACGGCCACGCACTXKCAGCCAACAGCGAGATGAeCAGCA 1560 

450 SGMLNNHGHAVPAHSEMTSS 469 

1561 GCCACGGCACCCAGTCCATGGTCTCGGGGTCCCACTGCACTCCGCCACCCCCCTACCACG 1620 

470 HGTQSMVSGSHCTPPPPYHA 489 

1621 CCGACCCCAGCCTCGTCAGGACCTGGGGGCCCTGAAGATCCCCGAGCAGTATCGCATGAC 1680 

490 DPSLVRTWGP* 509 

1681 CATCTGGCGGGGCCTGCAGGACCTGAAGCAGGGCCACGACTACGGCGCCGCCGCGCAGCA 1740 

17 41 GCTGCTCCGCTCCAGCAACGCGGCCGCCATTTCCATCGGCGGCTCCGGGGAGCTGCAGCG 1800 

1801 CCAGCGGGTCATGGAGGCCGTGCACTTCCGCGTGCGCCACACCATCACCATCCCCAACCG 1860 

1861 (^XXTGGCCCCGGCGCCGGCCCCGACGAGTGGGCGGACTTCGGCTTCGACCTGCCCGACTG 1920 

1921 CAAGGCCCGCAAGCAGCCCATCAAGGAGGAGTTCACGGAGGCCGAGATCCACTGAGGGGC 1980 

1981 CGGGCTrCAGCCAGAGCCroTGCCACCGCCCACAGACCCAGGCCGCCTC G CTCTC 2034 
1 GCGAGCTGCCCTCGGAGGCCGGCGTGGGGAAGATGGCCCAGTCCACCGCCACCTCCCCTG 60
-9 MAQSTATSPD 10

61 ATGGGGGCACCACGTTTGAGCACCTCTGGAGCTCTCTGGAACCAGACAGCACCTACTTCG 120
11 GGTTFEHLWSSLEPDSTYFD 30

121 ACCTTCCCCAGTCAAGCCGGGGGAATAATGAGGTGGTGGGCGGAACGGATTCCAGCATGG 180
31 LPQSSRGNNEVVGGTDSSMD 50

181 ACGTCTTCCACCTGGAGGGCATGACTACATCTGTCATGGCCCAGTTCAATCTGCTGAGCA 240
51 VFHLEGMTTSVMAQFNLLSS 70

241 GCACCATGGACCAGATGAGCAGCCGCGCGGCCTCGGCCAGCCCCTACACCCCAGAGCACG 300
71 TMDQMSSRAASASPYTPEHA 90

301 CCGCCAGCGTGCCCACCCACTCGCCCTACGCACAACCCAGCTCCACCTTCGACACCATGT 360
91 ASVPTHSPYAQPSSTFDTMS 110

361 CGCCGGCGCCTGTCATCCCCTCCAACACCGACTACCCCGGACCCCACCACTTTGAGGTCA 420
111 PAPVIPSNTDYPGPHHFEVT 130

421 CTTTCCAGCAGTCCAGCACGGCCAAGTCAGCCACCTGGACGTACTCCCCGCTCTTGAAGA 480
131 FQQSSTAKSATWTYSPLLKK 150

481 AACTCTACTGCCAGATCGCCAAGACATGCCCCATCCAGATCAAGGTGTCCACCCCGCCAC 540
151 LYCQIAKTCPIQIKVSTPPP 170

541 CCCCAGGCACTGCCATCCGGGCCATGCCTGTTTACAAGAAAGCGGAGCACGTGACCGACG 600
171 PGTAIRAMPVYKKAEHVTDV 190

601 TCGTGAAACGCTGCCCCAACCACGAGCTCGGGAGGGACTTCAACGAAGGACAGTCTGCTC 660
191 VKRCPNHELGRDFNEGQSAP 210

661 CAGCCAGCCACCTCATCCGCGTGGAAGGCAATAATCTCTCGCAGTATGTGGATGACCCTG 720
211 ASHLIRVEGNNLSQYVDDPV 230

721 TCACCGGCAGGCAGAGCGTCGTGGTGCCCTATGAGCCACCACAGGTGGGGACGGAATTCA 780
231 TGRQSVVVPYEPPQVGTEFT 250

781 CCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTAGGGGGCATGAACCGGCGGC 840
251 TILYNFMCNSSCVGGMNRRP 270

841 CCATCCTCATCATCATCACCCTGGAGATGCGGGATGGGCAGGTGCTGGGCCGCCGGTCCT 900
271 ILIIITLEMRDGQVLGRRSF 290

901 TTGAGGGCCGCATCTGCGCCTGTCCTGGCCGCGACCGAAAAGCTGATGAGGACCACTACC 960
291 EGRICACPGRDRKADEDHYR 310

961 GGGAGCAGCAGGCCCTGAACGAGAGCTCCGCCAAGAACGGGGCCGCCAGCAAGCGTGCCT 1020
311 EQQALNESSAKNGAASKRAF 330

1021 TCAAGCAGAGCCCCCCTGCCGTCCCCGCCCTTGGTGCCGGTGTGAAGAAGCGGCGGCATG 1080
331 KQSPPAVPALGAGVKKRRHG 350

1081 GAGACGAGGACACGTACTACCTTCAGGTGCGAGGCCGGGAGAACTTTGAGATCCTGATGA 1140
351 DEDTYYLQVRGRENFEILMK 370

1141 AGCTGAAAGAGAGCCTGGAGCTGATGGAGTTGGTGCCGCAGCCACTGGTGGACTCCTATC 1200
371 LKESLELMELVPQPLVDSYR 390

1201 GGCAGCAGCAGCAGCTCCTACAGAGGCCGAGTCACCTACAGCCCCCGTCCTACGGGCCGG 1260
391 QQQQLLQRPSHLQPPSYGPV 410

1261 TCCTCTCGCCCATGAACAAGGTGCACGGGGGCATGAACAAGCTGCCCTCCGTCAACCAGC 1320
411 LSPMNKVHGGMNKLPSVNQL 430

1321 TGGTGGGCCAGCCTCCCCCGCACAGTTCGGCAGCTACACCCAACCTGGGGCCCGTGGGCC 1380
431 VGQPPPHSSAATPNLGPVGP 450

1381 CCGGGATGCTCAACAACCATGGCCACGCAGTGCCAGCCAACGGCGAGATGAGCAGCAGCC 1440
451 GMLNNHGHAVPANGEMSSSH 470

1441 ACAGCGCCCAGTCCATGGTCTCGGGGTCCCACTGCACTCCGCCACCCCCCTACCACGCCG 1500
471 SAQSMVSGSHCTPPPPYHAD 490

1501 ACCCCAGCCTCGTCAGTTTTTTAACAGGATTGGGGTGTCCAAACTGCATCGAGTATTTCA 1560
491 PSLVSFLTGLGCPNCIEYFT 510

1561 CCTCCCAAGGGTTACAGAGCATTTACCACCTGCAGAACCTGACCATTGAGGACCTGGGGG 1620
511 SQGLQSIYHLQNLTIEDLGA 530

1621 CCCTGAAGATCCCCGAGCAGTACCGCATGACCATCTGGCGGGGCCTGCAGGACCTGAAGC 1680
531 LKIPEQYRMTIWRGLQDLKQ 550

1681 AGGGCCACGACTACAGCACCGCGCAGCAGCTGCTCCGCTCTAGCAACGCGGCCACCATCT 1740
551 GHDYSTAQQLLRSSNAATIS 570

1741 CCATCGGCGGCTCAGGGGAACTGCAGCGCCAGCGGGTCATGGAGGCCGTGCACTTCCGCG 1800
571 IGGSGELQRQRVMEAVHFRV 590

1801 TGCGCCACACCATCACCATCCCCAACCGCGGCGGCCCAGGCGGCGGCCCTGACGAGTGGG 1860
591 RHTITIPNRGGPGGGPDEWA 610

1861 CGGACTTCGGCTTCGACCTGCCCGACTGCAAGGCCCGCAAGCAGCCCATCAAGGAGGAGT 1920
611 DFGFDLPDCKARKQPIKEEF 630

1921 TCACGGAGGCCGAGATCCACTGAGGGCCTCGCCTGGCTGCAGCCTGCGCCACCGCCCAGA 1980
631 TEAEIH* 650

1981 GACCCAAGCTGCCTCCCCTCTCCTTCCTGTGTGTCCAAAACTGCCTCAGGAGGCAGGACC 2040

2041 TTCGGGCTGTGCCCGGGGAAAGGCAAGGTCCGGCCCATCCCCAGGCACCTCACAGGCCCC 2100

2101 AGGAAAGGCCCAGCCACCGAAGCCGCCTGTGGACAGCCTGAGTCACCTGCAGAACC 2156
1 TGATCTCCCTGTGGCCTGCAGGGGACTGAGCCAGGGAGTAGATGCCCTGAGACCCCAAGG 60

61 GACACCCAAGGAAACCTTGCTGtXnTTGAGAAAGGGATCGTCTCTCTCCTGCCC^iG^SA 120 

121 AGCATGTGTATGGGCCCTGTGTATGAATCCTTGGGGCAGGCCCAGTTCAATTTGCTCAGC inn 

0 MCMGPVYESLGQAQFNL L S H 

181 AGTGCCATGGACCAGATGGGCAGCCGTGCGGCCCCGGCGAGCCCCTACACCCCGGAGCAC tin 

20SAMDQMGSRAAPASPYTP Th 39 

241 GCCGCCAGCGCGCCCACCCACTCGCCCTACGCGCAGCCCAGCTCCACCTTCGACACCATG 3 00 

40AASAPTHSPYAQPSS7FDTM S9 

301 TCTCCGGCGCCTGTCATCCCTTCCAATACCGACTACCCCGGCCCCCACC\CTTCGAGGTC 3 60 

60SPAPVIPSNTDYPGPHHFE V 79 

3 61 ACCTTCCAGCAGTCGAGCACTGCCAAGTCGGCCACCTGGACATACTCCCCACTCTTGAAG 420 
aOTFQQSSTAKSATWTYSPLLK 99 

4 21 AAGTTGTACTGTC AGATTGCTAAGACATGCCCCATCCAGATCAAAGTGTCCACACCACCA 480 
100 KLYCQIAKTCPIQ1KVSTPP 119 

481 CCCCCGGGCACGGCCATCCGGGCCATGCCTGTCTACAAGAAGGCAGAGCATGTGACCGAC 540 

120 PPGTAIRAMPVYKKAEHVTD 139

541 ATTGTTAAGCGCTGCCCCAACCACGAGCTTGGAAGGGACTTCAATGAAGGACA[...] 600
140 IVKRCPNHELGRDFNEGQSA 159

601 CCGGCTAGCCACCTCATCCGTGTAGAAGGCAACAACCTCGCCCAGTACGTGGATGACCCT 660
160 PASHLIRVEGNNLAQYVDDP 179

661 GTCACCGGAAGGCAGAGTGTGGTTGTGCCGTATGAACCCCCACAGGTGGGAACAGAATTT 720
180 VTGRQSVVVPYEPPQVGTEF 199

721 ACCACCATCCTGTACAACTTCATGTGTAACAGCAGCTGTGTGGGGGGCATGAATCGGAGG 780
200 TTILYNFMCNSSCVGGMNRR 219

781 [illegible] 840
220 [illegible] 239

841 [illegible] 900
240 [illegible] 259

901 CGGGAGCAACAGGCTCTGAATGAAAGTACCACGAAAAATGGAG[...] 960
260 REQQALNESTTKNGAASKRA 279

961 TTCAAGCAGAGCCCCCCTGCCATCCCTGCCCTGGGTACCAACGTGAAGAAGAGACGCCAC 1020
280 FKQSPPAIPALGTNVKKRRH 299

1021 GGGGACGAGGACATGTTCTACATGCACGTGCGAGGCCGGGAGAACTTTGAGATCTTGATG 1080
300 GDEDMFYMHVRGRENFEILM 319

1081 AAAGTC AAGG AG AGC CT AG AACTGATGGAGCTTGTGCCCCAGCC TTTOlj' riXjACTC CTAT 1140 

320 KVKESLELMELVPQPLVDSY 339 

1141 CGACAGCAGCAGCAGCAGCAGCTCCTACAGAGGCCGAGTCACCTGCAGCCTCCATCCTAT 1200 

340 RQQQQQQLLQRPSHLQPPSY 359 

1201 GGGCCCGTGCTCTCTCCAATGAACAAGGTACACGGTGGTGTCAACAAACTGCCCTCCGTC 1260 

360 GPVLS PMNKVHGGVNKLPSV 379 

1261 AACCAGCTGGTGGGCCAGCCTCCCCCGCACAGCTCAGCAGCTGGGCCCAACCTGGGGCCC 1320 

380 NQLVGQPPPHSSAAGPNLGP 399 

1321 ATCGGCTCCGGGATGCTCAACAGCCACGGCCACAGCATGCCGGCCAATGGTGAGATGAAT 1380

400 IGSGMLNSHGHSMPANGEMN 419



13 81 GGAGGCCACAGCTCCCAGACCATGGTTTCGGGATCCCACTGCACCCCGCCACCCCCCTAT 1440 

420 GGHSSQTMVSGSHCTPPPPY 439 

1441 C A TG C AG AC CC C AGC CTC GTC AGTTTTTTG AC AGGGTTGGGGTGTCCAAACTGCATCG AG 1500 

440 HADPSLVSFLTGLGCPHCIE 459 

1501 TGCTTCACTTC C C AAGGGTTGCAGAGCATCTACCACCTGCAGAACCTTACCATCGAGGAC 1560 

460 CFTSQGLQSIYHLQNLTIED 479 

1561 CTTGGGGCTCTGAAGGTCCCTGACCAGTACCGTATGACCATCTGGAGGGGCCTACAGGAC 1620 

480 LGALKVPDQYRMTIWRGLQD 499 

1621 CTGAAGCAGAGCCATGACTGCGGCCAGCAACTGCTACGCTCCAGCAGCAACGCGGCCACC 1680 

500 LKQSHDCGQQLLRSSSMAAT 519 

16 81 ATCTCCATCGGCGGCTCTGGCGAGCTGCAGCGGCAGCGGGTCATGGAAGCCGTGCATTTC 1740 

520 I S I GGSGEL.QRQRVMEAVHF 539 

1741 CGTGTGCGCCACACCATCACAATCCCCAACCGTGGAGGCGCAGGTGCGGTGACAGGTCCC 1800 

540 RVRHTITIPNRGGAGAVTGP 559 

1801 GACGAGTGGGCGGACTTTGGCTTTGACCTGCCTGACTGCAAGTCCCGTAAGCAGCCCATC 1860 

560 DEWADFGFDLPDCKSRKQPI 579 

1861 AAAG AGGAGTTC ACAGAG AC AG AG AGC CACTGAGGAACGTACCTTCTTCTCCTGTCCTTC 1920 

580 KEEFTETESH* 599 

1921 CTCTGTGAGAAACTGCTCTTGGAAGTGGGACCTGTTGGCTGTGCCCA 1980 

1981 GGACCTTCTGCCGGATGCCATTCCTGAAGGGAAGTCGCTCATGAACTAACTCCCTCTTGG 2040 
1 TGGTCCCGCTTCGACCAAGACTCCGGCTACCAGCTTGCGGGCCCCGCGGAGGAGGAGACC 60

61 CCGCTGGGGCTAGCTGGGCGACGCGCGCCAAGCGGCGGCGGGAAGGAGGCGGGAGGAGCG 12 0 

121 GGGCCCGAGACCCCGACTCGGGCAGAGCCAGCTGGGGAGGCGGGGCGCGCGTGCX3AGCCA 180 

181 GGGGCCCGGGTG<X:CGGCCCTCCTCCGCCACGGCTGAGTGCCCGCGCTGCCTTCCCGCCG 240 

241 GTCCGCCAAGAAAGGCGCTAAGCCTGCGGCAGTCCCCTCGCCGCCGCCTCCCTGCTCCGC 300 

3 01 ACCCTTATAACCCGCCGTCCCGCATCCAGGCGAGGAGGCAACGCTGCAGCCCAGCCCTCG 3 60 

3 61 CCGACGCCGACGCCCGGCCCGGAGCAGAATGAGCGGCAGCGTTGGGGAGATGGCCCAGAC 42 0 
-8 MSGSVGEMAQTll 

421 CTCTTCTTCCTCCTCCTCCACCTTCGAGCACCTGTGGAGTTCTCTAGAGCCAGACAGCAC 430 

12 SSSSSSTFEHLWSSLEPDST 31 

4 81 CTACTTTGACCTCCCCCAGCCCAGCCAAGGGACTAGCGAGGCATCAGGCAGCGAGGAGTC 540 
32 YFDLPQPSQGTSEASGSEES 51 

5 41 CAACATGGATGTCTTCCACCTGCAAGGCATGGCCCAGTTCAATTTGCTCAGCAGTGCCAT 600 
52 NMDVFHLQGMAQFNLLSSAM 71 

601 GGACCAGATGGGCAGCCGTGCGGCCCCGGCGAGCCCCTACACCCCGGAGCACGCCGCCAG 660 

72 DQMGSRAAPASPYTPEHAAS 91 

661 CGCGCCCACCCACTCGCCCTACGCGCAGCCCAGCTCCACCTTCGACACCATGTCTCCGGC 720 
92 APTHSPYAQPSSTFDTMSPA 111 

721 GCCTGTCATCCCTTCCAATACCGACTACCCCGGCCCCC 758 
112 PVIPSNTDYPGP 123 
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Name: 
  sr-p70a-cos3     Len: 650   Check: 9661   Weight: 1.00
  sr-p70b-cos3     Len: 650   Check: 3605   Weight: 1.00
  sr-p70-ht29      Len: 650   Check: 85     Weight: 1.00
  sr-p70c-att20    Len: 650   Check: 4072   Weight: 1.00
  sr-p70a-att20    Len: 650   Check: 4204   Weight: 1.00



                1                                                   50
sr-p70a-cos3           MAQ STTTSPDGGT TFEHLWSSLE PDSTYFDLPQ SSRGNNEVVG
sr-p70b-cos3           MAQ STTTSPDGGT TFEHLWSSLE PDSTYFDLPQ SSRGNNEVVG
sr-p70-ht29            MAQ STATSPDGGT TFEHLWSSLE PDSTYFDLPQ SSRGNNEVVG
sr-p70c-att20   
sr-p70a-att20   MSGSVGEMAQ ...TSSSSSS TFEHLWSSLE PDSTYFDLPQ PSQGTSEASG

                51                                                 100
sr-p70a-cos3    GTDSSMD.VF HLEGMTTSVM AQFNLLSSTM DQMSSRAASA SPYTPEHAAS
sr-p70b-cos3    GTDSSMD.VF HLEGMTTSVM AQFNLLSSTM DQMSSRAASA SPYTPEHAAS
sr-p70-ht29     GTDSSMD.VF HLEGMTTSVM AQFNLLSSTM DQMSSRAASA SPYTPEHAAS
sr-p70c-att20   ...MCMGPVY ..ESLG...Q AQFNLLSSAM DQMGSRAAPA SPYTPEHAAS
sr-p70a-att20   SEESNMD.VF HLQGM      AQFNLLSSAM DQMGSRAAPA SPYTPEHAAS

                101                                                150
sr-p70a-cos3    VPTHSPYAQP SSTFDTMSPA PVIPSNTDYP GPHHFEVTFQ QSSTAKSATW
sr-p70b-cos3    VPTHSPYAQP SSTFDTMSPA PVIPSNTDYP GPHHFEVTFQ QSSTAKSATW
sr-p70-ht29     VPTHSPYAQP SSTFDTMSPA PVIPSNTDYP GPHHFEVTFQ QSSTAKSATW
sr-p70c-att20   APTHSPYAQP SSTFDTMSPA PVIPSNTDYP GPHHFEVTFQ QSSTAKSATW
sr-p70a-att20   APTHSPYAQP SSTFDTMSPA PVIPSNTDYP GP

                151                                                200
sr-p70a-cos3    TYSPLLKKLY CQIAKTCPIQ IKVSAPPPPG TAIRAMPVYK KAEHVTDIVK
sr-p70b-cos3    TYSPLLKKLY CQIAKTCPIQ IKVSAPPPPG TAIRAMPVYK KAEHVTDIVK
sr-p70-ht29     TYSPLLKKLY CQIAKTCPIQ IKVSTPPPPG TAIRAMPVYK KAEHVTDVVK
sr-p70c-att20   TYSPLLKKLY CQIAKTCPIQ IKVSTPPPPG TAIRAMPVYK KAEHVTDIVK
sr-p70a-att20   

                201                                                250
sr-p70a-cos3    RCPNHELGRD FNEGQSAPAS HLIRVEGNNL SQYVDDPVTG RQSVVVPYEP
sr-p70b-cos3    RCPNHELGRD FNEGQSAPAS HLIRVEGNNL SQYVDDPVTG RQSVVVPYEP
sr-p70-ht29     RCPNHELGRD FNEGQSAPAS HLIRVEGNNL SQYVDDPVTG RQSVVVPYEP
sr-p70c-att20   RCPNHELGRD FNEGQSAPAS HLIRVEGNNL AQYVDDPVTG RQSVVVPYEP
sr-p70a-att20   

                251                                                300
sr-p70a-cos3    PQVGTEFTTI LYNFMCNSSC VGGMNRRPIL IIITLETRDG QVLGRRSFEG
sr-p70b-cos3    PQVGTEFTTI LYNFMCNSSC VGGMNRRPIL IIITLETRDG QVLGRRSFEG
sr-p70-ht29     PQVGTEFTTI LYNFMCNSSC VGGMNRRPIL IIITLEMRDG QVLGRRSFEG
sr-p70c-att20   PQVGTEFTTI LYNFMCNSSC VGGMNRRPIL VIITLETRDG QVLGRRSFEG
sr-p70a-att20   

                301                                                350
sr-p70a-cos3    RICACPGRDR KADEDHYREQ QALNESSAKN GAASKRAFKQ SPPAVPALGP
sr-p70b-cos3    RICACPGRDR KADEDHYREQ QALNESSAKN GAASKRAFKQ SPPAVPALGP
sr-p70-ht29     RICACPGRDR KADEDHYREQ QALNESSAKN GAASKRAFKQ SPPAVPALGA
sr-p70c-att20   RICACPGRDR KADEDHYREQ QALNESTTKN GAASKRAFKQ SPPAIPALGT
sr-p70a-att20   



FIG. 9 



15/36 



09/125005 



                351                                                400
sr-p70a-cos3    GVKKRRHGDE DTYYLQVRGR ENFEILMKLK ESLELMELVP QPLVDSYR..
sr-p70b-cos3    GVKKRRHGDE DTYYLQVRGR ENFEILMKLK ESLELMELVP QPLVDSYR..
sr-p70-ht29     GVKKRRHGDE DTYYLQVRGR ENFEILMKLK ESLELMELVP QPLVDSYR..
sr-p70c-att20   NVKKRRHGDE DMFYMHVRGR ENFEILMKVK ESLELMELVP QPLVDSYRQQ
sr-p70a-att20   

                401                                                450
sr-p70a-cos3    QQQQLLQRPS HLQPPSYGPV LSPMNKVHGG VNKLPSVNQL VGQPPPHSSA
sr-p70b-cos3    QQQQLLQRPS HLQPPSYGPV LSPMNKVHGG VNKLPSVNQL VGQPPPHSSA
sr-p70-ht29     QQQQLLQRPS HLQPPSYGPV LSPMNKVHGG KNKLPSVNQL VGQPPPHSSA
sr-p70c-att20   QQQQLLQRPS HLQPPSYGPV LSPMNKVHGG VNKLPSVNQL VGQPPPHSSA
sr-p70a-att20   

                451                                                500
sr-p70a-cos3    ATPNLGPVGS GMLNNHGHAV PANSEMTSSH GTQSMVSGSH CTPPPPYHAD
sr-p70b-cos3    ATPNLGPVGS GMLNNHGHAV PANSEMTSSH GTQSMVSGSH CTPPPPYHAD
sr-p70-ht29     ATPNLGPVGP GMLNNHGHAV PANGEMSSSH SAQSMVSGSH CTPPPPYHAD
sr-p70c-att20   AGPNLGPMGS GMLNSHGHSM PANGEMNGGH SSQTMVSGSH CTPPPPYHAD
sr-p70a-att20   

                501                                                550
sr-p70a-cos3    PSLVSFLTGL GCPNCIEYFT SQGLQSIYHL QNLTIEDLGA LKIPEQYRMT
sr-p70b-cos3    PSLVR..T.W G.P
sr-p70-ht29     PSLVSFLTGL GCPNCIEYFT SQGLQSIYHL QNLTIEDLGA LKIPEQYRMT
sr-p70c-att20   PSLVSFLTGL GCPNCIECFT SQGLQSIYHL QNLTIEDLGA LKVPDQYRMT
sr-p70a-att20   

                551                                                600
sr-p70a-cos3    IWRGLQDLKQ GHDYGAAAQQ LLR.SSNAAA ISIGGSGELQ RQRVMEAVHF
sr-p70b-cos3    
sr-p70-ht29     IWRGLQDLKQ GHDYS.TAQQ LLR.SSNAAT ISIGGSGELQ RQRVMEAVHF
sr-p70c-att20   IWRGLQDLKQ SHDCG...QQ LLRSSSNAAT ISIGGSGELQ RQRVMEAVHF
sr-p70a-att20   

                601                                                650
sr-p70a-cos3    RVRHTITIPN RGGPGA..GP DEWADFGFDL PDCKARKQPI KEEFTEAEIH
sr-p70b-cos3    
sr-p70-ht29     RVRHTITIPN RGGPGG..GP DEWADFGFDL PDCKARKQPI KEEFTEAEIH
sr-p70c-att20   RVRHTITIPN RGGAGAVTGP DEWADFGFDL PDCKSRKQPI KEEFTETESH
sr-p70a-att20   
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1 < I > 2 < | > 3 

1 MAQS..TATSPDGGTTFEHLWSSLEPDSTYFDLPQSSRGNNEVVGGTDSSMD 50 

I I i I Mill || || 

1 MEEPQSDPSVEPPLSQETFSDLWKLLPE NNVLSPLPSQAMD 41 

51 VFHLEGMTTSVMAQFNLLSSTMDQMSSRAASASPYTPEHAASVPTHSPYA 100 

I I I i I I II I I 

42 DLML...SPDDIEQWFTEDPGPDEAPRMPEAAPPVAPAPAAPTPA.APAP 87 



101 QPSSTFDTMSPAPVIPSNTDYPGPHHFEVTFQQSSTAKSATWTYSPLLKK 150 

II II II I I I I II I I 111 1 1 I I 

88 APSWPL SSSVPSQKTYQGSYGFRLGFLHSGTAKSV TCTYSPA LNK 132 

151 LYCQIAKTCPIQIKVSTPPPPGTAIRAMPVYKKAEHVTDVVKRCPNHELG 200 

II IMM I I Mill Ml II II II III II 

133 MFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE.. 180 
201 RDFNEGQSAPASHLIRVEGNNLSQYVDDPVTGRQSVVVPYEPPQVGTEFT 250 

I M I M II II I III I I II 1 1 1 II II ill I 

181 RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCT 230 

< T >. « <■ ( > 8 . <^ > 7. 

251 TILYNFMCNSSCVGGMNRRPILIIITLEMRDGQVLGRRSFEGRICACPGR 300 

M I I I I I I I I I I I II II I I 1 1 1 1 1 I 1 1 I II I I II M II 

231 TIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGR 280 

301 DRKADEDHYREQQALNESSAKNGAASKRAFKQSPPAVPALGAGVKKRRHG 350 

II I I III I | 

281 DRRTEEENLRKKGEPHHELP..PGSTKRALPNNTSSSPQ PKKKPL 323 

< ■ ■ > 10 < * .>9 < ■ ,> 11 

351 DEDTYYLQVRGRENFEILMKLKESLELMELVPQPLVDSYRQQQQLLQRPS 400 

I ll,MM II I I III I I 

324 DGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKK 373 
401 HLQPPSYGPVLSPMNKVHGGMNKLPSVNQLVGQPPPHSSAATPNLGPVGP 450 
374 GQSTSRHKKLMFKTEGPDSD 393 
451 GMLNNHGHAVPANGEMSSSHSAQSMVSGSHCTPPPPYHADPSLVSFLTGL 500 



-^14 



501 GCPNCIEYFTSQGLQSIYHLQNLTIEDLGALKIPEQYRMTIWRGLQDLKQ 550 
551 GHDYSTAQQLLRSSNAATISIGGSGELQRQRVMEAVHFRVRHTITIPNRG 600 
601 GPGGGPDEWADFGFDLPDCKARKQPIKEEFTEAEIH 636 
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sr-p70d-imr32 
sr-p70a-ht29 



sr-p70d-imr32                     CG ACCTTCCCCA GTCAAGCCGG GGGAATAATG  32
sr-p70a-ht29                      CG ACCTTCCCCA GTCAAGCCGG GGGAATAATG  150

sr-p70d-imr32  AGGTGGTGGG CGGAACGGAT TCCAGCATGG ACGTCTTCCA CCTGGAGGGC  82
sr-p70a-ht29   AGGTGGTGGG CGGAACGGAT TCCAGCATGG ACGTCTTCCA CCTGGAGGGC  200

sr-p70d-imr32  ATGACTACAT CTGTCATGCA TCCTCGGCTC CTGCCTCACT AGCTGCGGAG  132
sr-p70a-ht29   ATGACTACAT CTGTCAT.                                     217

sr-p70d-imr32  CCTCTCCCGC TCGGTCCACG CTGCCGGGCG GCCACGACCG TGACCCTTCC  182

sr-p70d-imr32  CCTCGGGCCG CCCAGATCCA TGCCTCGTCC CACGGGACAC CAGTTCCCTG  232

sr-p70d-imr32  GCGTGTGCAG ACCCCCCGGC GCCTACCATG CTGTACGTCG GTGACCCCGC  282

sr-p70d-imr32  ACGGCACCTC GCCACGGCCC AGTTCAATCT GCTGAGCAGC ACCATGGACC  332
sr-p70a-ht29                   GGCCC AGTTCAATCT GCTGAGCAGC ACCATGGACC  252

sr-p70d-imr32  AGATGAGCAG CCGCGCGGCC TCGCCCAGCC CCTACACCCC AGAGCACGCC  382
sr-p70a-ht29   AGATGAGCAG CCGCGCGGCC TCGGCCAGCC CCTACACCCC AGAGCACGCC  302

sr-p70d-imr32  GCCAGCGTGC CCACCCACTC GCCCTACGCA CAACCCAGCT CCACCTTCGA  432
sr-p70a-ht29   GCCAGCGTGC CCACCCACTC GCCCTACGCA CAACCCAGCT CCACCTTCGA  352

sr-p70d-imr32  CACCATGTCG CCGGCGCCTG TCATCCCCTC CAACACCGAC TACCCCGGAC  482
sr-p70a-ht29   CACCATGTCG CCGGCGCCTG TCATCCCCTC CAACACCGAC TACCCCGGAC  402

sr-p70d-imr32  CCCACCACTT TGAGGTCACT TTCCAGCAGT CCAGCACGGC CAAGTCAGCC  532
sr-p70a-ht29   CCCACCACTT TGAGGTCACT TTCCAGCAGT CCAGCACGGC CAAGTCAGCC  452

sr-p70d-imr32  ACCTGGACGT ACTCCCCGCT CTTGAAG
sr-p70a-ht29   ACCTGGACGT ACTCCCCGCT CTTGAAG
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Docket No. IVD913 

DECLARATION AND POWER OF ATTORNEY FOR 
UNITED STATES PATENT APPLICATION 



X Original 



Supplemental 



Substitute 



As a below-named inventor, I hereby declare that: 

My residence, citizenship and post office address are given below under my name. 

I believe I am an original, first and joint inventor of the subject matter which is claimed and 
for which a patent is sought on the invention entitled: 

Purified SR-p70 protein 

the specification of which 

is attached hereto. 

X was filed on as United States 

Application Serial No. 

and was amended on (if applicable). 

was filed on February 03, 1997 as PCT International 



and was amended under PCT Article 19 on September 02, 1997 (if applicable). 

I have reviewed and understand the contents of the above-identified specification, including 
the claims, as amended by any amendment specifically referred to above. 

I acknowledge my duty to disclose information of which I am aware which is material to the 
examination of this application in accordance with Section 1.56 of Title 37 of the Code of Federal 
Regulations. 

I hereby claim foreign priority benefit under Section 119 (a) - (d) of Title 35 of the United 
States Code of any foreign application(s) for patent or inventor's certificate or of any PCT 
application(s) designating at least one country other than the United States identified below and also 
identify below any foreign application(s) for patent or inventor's certificate or any PCT application(s) 
designating at least one country other than the United States filed by me on the same subject matter 
and having a filing date before that of the application(s) from which priority is claimed: 


Application No. PCT/FR97/00214 

Country     Number      Filing Date           Priority Claimed (Yes No) 
FRANCE      96 01309    February 02, 1996     X 
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I hereby claim benefit under Section 120 of Title 35 of the United States Code of any United 
States application(s) or PCT application(s) designating the United States identified below and, 
insofar as the subject matter of each of the claims of this application is not disclosed in said prior 
application(s) in the manner provided by the first paragraph of Section 112 of Title 35 of the United 
States Code, I acknowledge my duty to disclose material information of which I am aware as defined 
in Section 1.56 of Title 37 of the Code of Federal Regulations which occurred between the filing date 
of the prior application(s) and the national or PCT filing date of this application: 

Application Serial No. Filing Date Status 



I hereby appoint Mary P. Bauman, Reg. No. [illegible], Michael D. Alexander, Reg. No. 36,080, 
and Paul E. Dupont, Reg. No. 27,438, or any of them my attorneys or agents with full power of 
substitution and revocation to prosecute this application and to transact all business in the Patent and 
Trademark Office connected therewith. 

SEND CORRESPONDENCE TO: DIRECT TELEPHONE CALLS TO: 

Patent Department                    MICHAEL D. ALEXANDER 

Sanofi Pharmaceuticals, Inc. 
9 Great Valley Parkway 

£qZB£Ql3Q26__. ~~ Telephone No. (610) 889-8802 



I hereby declare that all statements made herein and in the above-identified specification of 
my own knowledge are true and that all statements made on information and belief are believed to be 
true; and further that these statements were made with the knowledge that willful false statements and 
the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of 
the United States Code and that such willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 



Full name of first joint inventor Daniel CAPUT 

Inventor's signature                                   Date 

Residence La Bousquiere, 31290 Avignonet Lauragais, France 

Post Office Address La Bousquiere, 31290 Avignonet Lauragais, France 

Citizenship French 
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p& Full name of second joint inventor ,,Pascual FERRARA 

Inventor's signature 

Residence Libouille Saint-Assiscle, 31290 Avignonet Lauragais, France 

Post Office Address Libouille Saint-Assiscle, 31290 Avignonet Lauragais, France 

Citizenship France 



Full name of third joint inventor Ahmed Mourad KAGHAD 

Inventor's signature                                   Date 

Residence 5, rue de la Poste, 31450 Montgiscard, France 

Post Office Address 5, rue de la Poste, 31450 Montgiscard, France 

Citizenship French 
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